Low Upload Speed with TMG 2010 Array

    Question

  • Hello,

    We have a Forefront Enterprise Array of 2 members in a 3-leg configuration with NLB on all "legs" and 1GB networking. Our internet line is 100MB/s symmetrical. We don't have NIS, HTTPS or malware inspection features enabled and the server specification is 16GB RAM, single quad core 3.0GHz Xeon.

    When we use Speedtest.net to test the internet line speed without going through the firewall array (i.e. from a laptop with a public IP, directly connected), we get results of 70-90 MB/s upload and download as expected with our line so I don't have issues with a direct connection.

    When we calculate a file transfer from the internal to perimeter network, we get transfer speeds of less than 150 MB/s and this is with GB links. We're testing large (multi-GB) transfers to and from laptops that have SSDs and servers with RAID 10 so we're not concerned that disk read/write access could be limiting (we're also using wired GB LAN, not wireless etc).

    When we use Speedtest.net from a machine on the internal network, our download speed is slower by about 20% (40-60 MB/s) but our upload speed reports as being only a few MB/s - and this is on a symmetrical 100MB line!
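    A note on units: the figures above are written loosely as "MB/s". Assuming they are really megabits per second, as speed-test sites report, a small sketch like the following shows what they mean in bytes and in transfer time. The function names are illustrative, and the 5 Mb/s upload figure is only a stand-in for "a few".

```python
def mbit_to_mbyte(mbit_per_s: float) -> float:
    """Convert megabits per second to megabytes per second (8 bits per byte)."""
    return mbit_per_s / 8.0


def transfer_seconds(size_mbyte: float, rate_mbit_per_s: float) -> float:
    """Seconds needed to move size_mbyte megabytes at rate_mbit_per_s."""
    return size_mbyte / mbit_to_mbyte(rate_mbit_per_s)


if __name__ == "__main__":
    # Rates in Mb/s: nominal line rate, a mid-range download through the
    # array, and an assumed value standing in for "a few" on the upload side.
    rates = [("line rate", 100.0), ("download via array", 50.0), ("upload via array (assumed)", 5.0)]
    for label, rate in rates:
        print(f"{label}: {rate:.0f} Mb/s = {mbit_to_mbyte(rate):.2f} MB/s, "
              f"1000 MB takes {transfer_seconds(1000, rate):.0f} s")
```

    On that reading, a 100 Mb/s line moves at most 12.5 MB per second.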

     

    We're at a loss as to why the download speed is a fair amount less than a direct connection and especially why upload speed from the internal network is so low. I can't see where the bottleneck might be and why it would work in one direction. Our networking, servers and line are pretty fast and I can't see any maxing out of network or CPU performance.

    Does anyone have any ideas of what to check or where to look?

    Thanks

    Sunil

    Wednesday, August 04, 2010 1:42 PM

All replies

  • Can you advise on your environment a little? Version and service pack of the OS? Are you running with FTMG SP1?

    Is this a new installation? A new issue? Has it been OK in the past and just started playing up?

    I assume the NLB set up is from within FTMG?

    Are there any alerts in the FTMG gui dashboard?

    If you disable/disconnect one of the nodes, does the speed/performance metrics change?

    If you remove NLB (I know it may be painful) and use a single node, does the issue persist?

    Anything reported from the best practice analyser?


    Keith Alabaster - MVP/Forum Moderator
    Thursday, August 05, 2010 8:38 PM
  • Hello,

    Thanks for the response, here are some further details:

    > Version and service pack of the OS? Are you running with FTMG SP1?
    This is Forefront TMG 2010 RTM running on fully patched Windows Server 2008 R2 x64 Standard. I couldn't imagine TMG RTM being released with a bug that would cause such a slow limited upload, there would be far more noise on the internet about it, especially as it is supposed to be used in enterprise environments where internet/line speed is much higher than 100Mb/s.

    > Is this a new installation? A new issue? Has it been OK in the past and just started playing up?
    This installation of the TMG Array has actually been in production for some time. We didn't test the bandwidth when it was installed; it was only recently, when we upgraded our internet line to 100Mb/s and wanted to verify that we were actually getting this speed, that we noticed the asymmetry in the upload. I think we have always had this problem (and just not been aware of it).

    > I assume the NLB set up is from within FTMG?
    Yes, this is TMG integrated NLB (using Windows/Microsoft NLB).

    > Are there any alerts in the FTMG gui dashboard?
    There are no alerts I can see that would affect the upload. We do have alerts but they are for unrelated things (the odd compression failure, active FTP etc, nothing that worries me or stands out). I have also monitored the alerts while running the speed tests to see whether any errors or warnings occur then (such as maximum connections or any other "maximum" reached) but none appear.

    > If you disable/disconnect one of the nodes, does the speed/performance metrics change?
    This is an interesting one. What we did was to bring up another TMG server as a completely separate standalone gateway (on different but very high spec hardware) and we kept the configuration as minimal as possible (just enough for a connection). When we did a speed test through this gateway, we received higher download speeds (along the lines of a direct external connection) but the upload was still limited to statistically the exact upload speed of the array. So in this sense if we disconnect one of the nodes, I suspect there will be no change as we have already tested with a standalone TMG gateway.

    > If you remove NLB (I know it may be painful) and use a single node, does the issue persist?
    In a word: yes, it persists. As above, we tried bringing up a standalone TMG server and noticed that the upload speed was still asymmetrical and still very low.

    > Anything reported from the best practice analyser?
    We've run the TMG BPA on one of the nodes and there is nothing that stands out, it re-iterates a lot of the errors from the TMG dashboard but there are no errors where the client running the speed test (from the internal network) is being limited. For our testing stand-alone server we turned off a lot of the IP flood mitigation settings and other IP-level options to no effect.

    All the reasons we can think of as to why there may be a limit should act symmetrically, i.e. if it is fast in one direction, why is it so slow in the other?

    We've tried basic SOHO firewall boxes as alternate gateways too and these also give symmetrical upload and download speeds so all I can say is that the internet and our network is fine (as we replicated the path from the client to the testing gateways to our external switch - this is the same path as for the production TMG Array). So the fault lies with the TMG/OS/Hardware combo, the NLB has been eliminated (as the standalone displays the same problem).

    Friday, August 06, 2010 9:53 AM
  • Thanks for the refreshingly concise responses. Just a couple of additional questions....

    What is being used for the speed test - is this a test to a remote web site or a test provided by the ISP/provider? Is the same discrepancy seen when performing a file upload/download? i.e. are web traffic and non-web traffic equally affected for performance? If you run the speed tests from the FTMG node itself - same issue?

    If you install something like Net Monitor/Wireshark on the FTMG nodes are you noticing a high number of retransmit requests or other nasty traffic events?

     


    Keith Alabaster - MVP/Forum Moderator
    Saturday, August 07, 2010 7:48 AM
  • Hello Keith,

    We're using a range of web tools for the speed test (SpeedTest and MyBroadBandSpeed). At first I was dubious as to their reliability, but over the time that I have used them for testing they have actually been not bad. There is some variability in the results, but with the testing we have done in various scenarios I am confident that these are not the reason for the asymmetry in the upload speed.

    I have also installed TMG SP1 on the testing stand-alone TMG server and that has not made a difference. I have also installed Net Monitor 3.4 and did a packet capture during the period a client performed a speed test and filtering by TCP Retransmit errors actually gave back no results. Performing the speedtests from the TMG server itself also made no difference.

    I think the "upload" speed of all traffic going through/out of the TMG server is greatly reduced, a while back I tested a file transfer from a machine behind TMG to one in front of it on the public network (SMB) and this was also much slower than if both machines were behind the TMG. I remember that the transfer speed was significantly slower (much more than taking into account a small performance hit because of going through the TMG array).

    I might have to bring up another server as an ISA server and compare the differences but that is a lot of work :-(

    Sunil
    Monday, August 09, 2010 10:37 AM
  • Thanks for the further update.  Sadly my test lab does not boast a 100Mb link (I wish), and is insufficient for me to try and reproduce your timing issues because the variance between one method and the other is likely to be so small that it is not really proving anything one way or the other.

    What I can do though is escalate this accordingly and try to get some eyes focussed who certainly will have the infrastructure to see if this is indeed a repeatable situation. I may need to come back to you asking you to provide a repro output from the BPA.

    In the meantime, on one of your FTMG nodes (or the standalone unit), if you stop the fweng service thereby essentially making it a standard server again, does the performance also improve back to expected throughputs when an upload is attempted from the FTMG box itself?

    Keith


    Keith Alabaster - MVP/Forum Moderator
    Monday, August 09, 2010 4:29 PM
  • Hi Sunil,

    I have been in contact with the FTMG Programme Manager who has advised that this is not an issue that has been reported (in the sense of a frequently seen issue). That is not to say that this is the first time it has been reported, just that the cases to date have proven to be localised to the site environment rather than an FTMG product issue per se.

    Is it possible to provide a sanitised output from the FTMG standalone box for ipconfig /all and route print? It does sound as though you have been extremely diligent to date in testing but just to remove the obvious causes of performance issues would be good.

    Did you manage to try an upload from the FTMG node without the firewall engine running?

    Keith


    Keith Alabaster - MVP/Forum Moderator
    Tuesday, August 10, 2010 4:55 PM
  • Hello Keith,

    Thanks for keeping with this. I did a "quick" reversion to ISA Server 2006 and there were no changes (improvements) there (still the same numbers) so I have gone back to look at our standalone TMG testing server in more detail. As a quick aside (to see whether the hardware was limiting) I tried a few Linux firewall-centric distributions (Untangle and ZeroShell). The best result we've had is with ZeroShell, which managed 98MB/s down and 75 up, so the network card we're using (Intel Pro 1000 Quad Port PT Server adapter) does seem to be able to cope with the throughput. Untangle managed 50 MB/s symmetric.

    I'm going to be on holiday until into next week, which is why you won't hear from me for a little while, but I'll pick this up when I get back and will try out your suggestions. I just wish we had more bandwidth on our other line to compare as the size of its pipe is close to the TMG upload limit we're hitting.

    Here's "ipconfig /all":

     

    Windows IP Configuration

    Host Name . . . . . . . . . . . . : TMG-Testing
    Primary Dns Suffix . . . . . . . : spiralsoft.local
    Node Type . . . . . . . . . . . . : Hybrid
    IP Routing Enabled. . . . . . . . : Yes
    WINS Proxy Enabled. . . . . . . . : No
    DNS Suffix Search List. . . . . . : spiralsoft.local

    Ethernet adapter External Line to NTL:

    Connection-specific DNS Suffix . :
    Description . . . . . . . . . . . : Intel(R) PRO/1000 PT Quad Port Server Adapter #4
    Physical Address. . . . . . . . . : 00-15-17-7D-F7-9D
    DHCP Enabled. . . . . . . . . . . : No
    Autoconfiguration Enabled . . . . : Yes
    IPv4 Address. . . . . . . . . . . : 81.106.220.58(Preferred)
    Subnet Mask . . . . . . . . . . . : 255.255.255.224
    Default Gateway . . . . . . . . . : 81.106.220.33
    DNS Servers . . . . . . . . . . . : 194.168.4.100
    194.168.8.100
    NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter Spiral Internal Network:

    Connection-specific DNS Suffix . :
    Description . . . . . . . . . . . : Intel(R) PRO/1000 PT Quad Port Server Adapter #3
    Physical Address. . . . . . . . . : 00-15-17-7D-F7-9F
    DHCP Enabled. . . . . . . . . . . : No
    Autoconfiguration Enabled . . . . : Yes
    Link-local IPv6 Address . . . . . : fe80::7968:fab4:4c0d:f4f7%17(Preferred)
    IPv4 Address. . . . . . . . . . . : 172.24.1.31(Preferred)
    Subnet Mask . . . . . . . . . . . : 255.255.0.0
    Default Gateway . . . . . . . . . :
    DHCPv6 IAID . . . . . . . . . . . : 469767447
    DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-13-F5-E5-FC-00-18-FE-7F-54-DE

    DNS Servers . . . . . . . . . . . : 172.24.1.1
    172.24.1.2
    NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter External Line to BT:

    Connection-specific DNS Suffix . :
    Description . . . . . . . . . . . : Intel(R) PRO/1000 PT Quad Port Server Adapter #2
    Physical Address. . . . . . . . . : 00-15-17-7D-F7-9E
    DHCP Enabled. . . . . . . . . . . : No
    Autoconfiguration Enabled . . . . : Yes
    IPv4 Address. . . . . . . . . . . : 217.33.181.103(Preferred)
    Subnet Mask . . . . . . . . . . . : 255.255.255.128
    Default Gateway . . . . . . . . . : 217.33.181.1
    DNS Servers . . . . . . . . . . . : 194.72.6.57
    194.73.82.242
    NetBIOS over Tcpip. . . . . . . . : Enabled

    Ethernet adapter DMZ Network:

    Connection-specific DNS Suffix . :
    Description . . . . . . . . . . . : Intel(R) PRO/1000 PT Quad Port Server Adapter
    Physical Address. . . . . . . . . : 00-15-17-7D-F7-9C
    DHCP Enabled. . . . . . . . . . . : No
    Autoconfiguration Enabled . . . . : Yes
    Link-local IPv6 Address . . . . . : fe80::b45b:bc98:d1d:a32d%15(Preferred)
    IPv4 Address. . . . . . . . . . . : 192.168.100.10(Preferred)
    Subnet Mask . . . . . . . . . . . : 255.255.255.0
    Default Gateway . . . . . . . . . :
    DHCPv6 IAID . . . . . . . . . . . : 369104151
    DHCPv6 Client DUID. . . . . . . . : 00-01-00-01-13-F5-E5-FC-00-18-FE-7F-54-DE

    DNS Servers . . . . . . . . . . . : 192.168.100.51
    192.168.100.52
    NetBIOS over Tcpip. . . . . . . . : Enabled

    Tunnel adapter 6TO4 Adapter:

    Connection-specific DNS Suffix . :
    Description . . . . . . . . . . . : Microsoft 6to4 Adapter
    Physical Address. . . . . . . . . : 00-00-00-00-00-00-00-E0
    DHCP Enabled. . . . . . . . . . . : No
    Autoconfiguration Enabled . . . . : Yes
    IPv6 Address. . . . . . . . . . . : 2002:516a:dc3a::516a:dc3a(Preferred)
    IPv6 Address. . . . . . . . . . . : 2002:d921:b567::d921:b567(Preferred)
    Default Gateway . . . . . . . . . : 2002:c058:6301::c058:6301
    DNS Servers . . . . . . . . . . . : 194.72.6.57
    194.73.82.242
    NetBIOS over Tcpip. . . . . . . . : Disabled
    ================================================================================

    Here is "route print":

    ===========================================================================
    Interface List
    19...00 15 17 7d f7 9d ......Intel(R) PRO/1000 PT Quad Port Server Adapter #4
    17...00 15 17 7d f7 9f ......Intel(R) PRO/1000 PT Quad Port Server Adapter #3
    16...00 15 17 7d f7 9e ......Intel(R) PRO/1000 PT Quad Port Server Adapter #2
    15...00 15 17 7d f7 9c ......Intel(R) PRO/1000 PT Quad Port Server Adapter
    1...........................Software Loopback Interface 1
    31...00 00 00 00 00 00 00 e0 Microsoft 6to4 Adapter
    ===========================================================================

    IPv4 Route Table
    ===========================================================================
    Active Routes:
    Network Destination Netmask Gateway Interface Metric
    0.0.0.0 0.0.0.0 217.33.181.1 217.33.181.103 257
    0.0.0.0 0.0.0.0 81.106.220.33 81.106.220.58 258
    81.106.220.32 255.255.255.224 On-link 81.106.220.58 258
    81.106.220.58 255.255.255.255 On-link 81.106.220.58 258
    81.106.220.63 255.255.255.255 On-link 81.106.220.58 258
    127.0.0.0 255.0.0.0 On-link 127.0.0.1 306
    127.0.0.1 255.255.255.255 On-link 127.0.0.1 306
    127.255.255.255 255.255.255.255 On-link 127.0.0.1 306
    172.24.0.0 255.255.0.0 On-link 172.24.1.31 266
    172.24.1.31 255.255.255.255 On-link 172.24.1.31 266
    172.24.255.255 255.255.255.255 On-link 172.24.1.31 266
    192.168.100.0 255.255.255.0 On-link 192.168.100.10 266
    192.168.100.10 255.255.255.255 On-link 192.168.100.10 266
    192.168.100.255 255.255.255.255 On-link 192.168.100.10 266
    217.33.181.0 255.255.255.128 On-link 217.33.181.103 257
    217.33.181.103 255.255.255.255 On-link 217.33.181.103 257
    217.33.181.127 255.255.255.255 On-link 217.33.181.103 257
    224.0.0.0 240.0.0.0 On-link 127.0.0.1 306
    224.0.0.0 240.0.0.0 On-link 192.168.100.10 266
    224.0.0.0 240.0.0.0 On-link 81.106.220.58 258
    224.0.0.0 240.0.0.0 On-link 217.33.181.103 257
    224.0.0.0 240.0.0.0 On-link 172.24.1.31 266
    255.255.255.255 255.255.255.255 On-link 127.0.0.1 306
    255.255.255.255 255.255.255.255 On-link 192.168.100.10 266
    255.255.255.255 255.255.255.255 On-link 81.106.220.58 258
    255.255.255.255 255.255.255.255 On-link 217.33.181.103 257
    255.255.255.255 255.255.255.255 On-link 172.24.1.31 266
    ===========================================================================
    Persistent Routes:
    Network Address Netmask Gateway Address Metric
    0.0.0.0 0.0.0.0 217.33.181.1 Default
    0.0.0.0 0.0.0.0 81.106.220.33 Default
    ===========================================================================

    IPv6 Route Table
    ===========================================================================
    Active Routes:
    If Metric Network Destination Gateway
    31 1110 ::/0 2002:c058:6301::c058:6301
    1 306 ::1/128 On-link
    31 1010 2002::/16 On-link
    31 266 2002:516a:dc3a::516a:dc3a/128
    On-link
    31 266 2002:d921:b567::d921:b567/128
    On-link
    15 266 fe80::/64 On-link
    17 266 fe80::/64 On-link
    17 266 fe80::7968:fab4:4c0d:f4f7/128
    On-link
    15 266 fe80::b45b:bc98:d1d:a32d/128
    On-link
    1 306 ff00::/8 On-link
    15 266 ff00::/8 On-link
    17 266 ff00::/8 On-link
    ===========================================================================
    Persistent Routes:
    None


    We have the ISP redundancy feature enabled in failover mode only, turning this on and off has no effect. I'll be looking at this a little today and then I'll be off and back on Tuesday.

    Sunil

    Thursday, August 12, 2010 11:09 AM
  • Hello Keith,

    Here's an interesting result: For the stand-alone testing firewall, I tried turning the firewall service OFF and testing the speed from the TMG server itself and the speed tests come out at around 78 MB/s download and 43 MB/s upload i.e. the upload speed is much higher. When the Firewall Service is turned back on again, we get a much higher 90+ MB/s download but the upload is still crawling at 6-7 MB/s.

    So downloads are faster with the TMG firewall ON, but this makes the upload crawl and downloads are slower but uploads are much faster with the firewall OFF. There's still only one case (downloads with firewall on) where we're getting anything like the full bandwidth of our pipe reported.

    Sunil

    Thursday, August 12, 2010 12:15 PM
  • Can you review your DNS Settings?

    DNS settings for the FTMG servers should ONLY point to the internal DNS Servers - I see that you have reference to external DNS servers on the external-facing nics. To the best of my knowledge, the fact that you are using NLB or ISP-R does not alter the fact that the TMG nodes should know nothing about external DNS ip addresses. Your internal DNS forwarders deal with external DNS name resolution.
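    As a quick way to spot this, here is a minimal sketch that flags DNS servers outside the private address ranges, per adapter. The adapter names and addresses are copied from the ipconfig /all output earlier in the thread; the function name is illustrative, and "not private" is used here only as a rough stand-in for "external".

```python
import ipaddress


def public_dns_servers(dns_by_adapter: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return, per adapter, the DNS servers that ipaddress does not classify as private."""
    flagged = {}
    for adapter, servers in dns_by_adapter.items():
        public = [s for s in servers if not ipaddress.ip_address(s).is_private]
        if public:
            flagged[adapter] = public
    return flagged


# DNS servers per adapter, as listed in the ipconfig /all output in this thread.
DNS_FROM_THREAD = {
    "External Line to NTL": ["194.168.4.100", "194.168.8.100"],
    "Spiral Internal Network": ["172.24.1.1", "172.24.1.2"],
    "External Line to BT": ["194.72.6.57", "194.73.82.242"],
    "DMZ Network": ["192.168.100.51", "192.168.100.52"],
}

if __name__ == "__main__":
    for adapter, servers in public_dns_servers(DNS_FROM_THREAD).items():
        print(f"{adapter}: external DNS configured -> {', '.join(servers)}")
```

    With the data above, only the two external-facing adapters are flagged.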

    I would also be interested to see if disabling IPv6 made any difference for you as well. IPv6 has given me a number of headaches over the past few months in respect to some anomalies (only one impacted performance).

     


    Keith Alabaster - MVP/Forum Moderator
    Saturday, August 14, 2010 6:14 AM
  • Can you verify that speed and duplex settings are defined correctly (and the same) on the NIC and switch ports?

    TMG NICs should be configured as follows: http://blog.msedge.org.uk/2010/04/recommended-network-card-configuration_14.html

    Only one NIC should have a default gateway configured; the other gateway is defined within the ISP-R configuration, not at the NIC level.
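    To check this against the route print output posted earlier, a small sketch such as the following counts the active IPv4 default routes. It assumes the whitespace-separated five-column layout shown in this thread; the function name and the trimmed sample are illustrative.

```python
def default_routes(route_print_text: str) -> list[tuple[str, str, int]]:
    """Return (gateway, interface, metric) for each active IPv4 default route."""
    routes = []
    for line in route_print_text.splitlines():
        fields = line.split()
        # Active routes have five columns: destination, netmask, gateway,
        # interface, metric. Persistent-route lines end in "Default" and
        # have only four, so they are skipped here.
        if (len(fields) == 5 and fields[0] == "0.0.0.0"
                and fields[1] == "0.0.0.0" and fields[4].isdigit()):
            routes.append((fields[2], fields[3], int(fields[4])))
    return routes


# Trimmed sample taken from the route print output in this thread.
SAMPLE = """\
0.0.0.0 0.0.0.0 217.33.181.1 217.33.181.103 257
0.0.0.0 0.0.0.0 81.106.220.33 81.106.220.58 258
81.106.220.32 255.255.255.224 On-link 81.106.220.58 258
Persistent Routes:
0.0.0.0 0.0.0.0 217.33.181.1 Default
"""

if __name__ == "__main__":
    found = default_routes(SAMPLE)
    for gateway, interface, metric in found:
        print(f"default route via {gateway} on {interface} (metric {metric})")
    if len(found) > 1:
        print("More than one NIC has a default gateway configured.")
```

    On the posted output this reports two default routes, one per external line.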

    Cheers

    JJ


    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Tuesday, August 17, 2010 12:18 AM
  • Hello Keith and Jason,

    I'm now back from a brief holiday and back to wrestling with the TMG problem.

    Thanks for your suggestions about turning off IPv6 and the NIC settings. We have IPv6 turned off on the external NIC and our internal network uses IPv4, so there should be no IPv6 to 4 translation delays/issues as it shouldn't be being used. I changed the network settings (specifically the DNS) on the external NICs by removing the DNS configuration as per Jason's link to MS Edge. Neither suggestion had any discernible effect on the upload speed; it remains limited to about 6 MB/s while downloads are up at around 94 Mb/s this morning.

    We have also spent an afternoon last week looking at the NIC settings such as the Flow Control and the Speed (Auto) Negotiation both on the testing TMG server and the switches (internal, external) it is connected to. We found the best setting was to keep the Auto Negotiation on. The LAN side of the BT ISP router is also a gigabit port (we have gigabit networking throughout with HP ProCurve switches). Deviating the settings from Auto Negotiate either caused a reduction in the download speed (sometimes dramatically) and/or packet errors that the switches could count in their Port Stats (I'm assuming these cause TCP retransmit errors).

    Sunil
    Tuesday, August 17, 2010 7:49 AM
    In my experience, duplex settings are the biggest cause of asymmetric speed issues, so it's good that you have looked at this. Personally, I would force speed/duplex settings on both the NIC and the switch ports.

    DNS (as discussed) is also another likely issue; as is the PathMTU setting.

    Are you running TMG SP1? Have you run the TMG BPA to look for common problems?

    You could also look at disabling any Scalable Networking options on the NICs.

    Cheers

    JJ


    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Tuesday, August 17, 2010 9:37 AM
  • Sunil, As Jason has taken the question on now, I'll leave it for him to finish up. I'll keep a watch though on how it progresses.

    Thanks

    Keith


    Keith Alabaster - MVP/Forum Moderator
    Tuesday, August 17, 2010 4:15 PM
  • Hello,

    Thank you both for your help so far. I've opened a Support Case with Microsoft on this issue as this is beyond my ability to solve. I demonstrated the asymmetry in the uploads and downloads to the engineer and, going back to the earlier question of what happens when the firewall engine is turned off: the downloads are around 95MB/s and uploads rocket up to around 65MB/s (so still not line speed but an order of magnitude better than we have currently) - the fastest we've recorded is 75MB/s up and 104 down with "ZeroShell" (a Linux Live CD firewall distro).

    The current status is that the engineer has requested a lab environment to replicate the TMG set up and we'll just see how this pans out.

    I'll occasionally update the thread here with news and progress.

    Sunil

    Thursday, August 19, 2010 2:43 PM
  • Sunil, yes please - getting the results from the support call would be appreciated. Fingers crossed!!
    Keith Alabaster - MVP/Forum Moderator
    Thursday, August 19, 2010 9:36 PM
  • I will be interested to see what you find out. I have the same issue with a simple setup. Just one TMG server. Please let me know if I can provide any data.
    Wednesday, August 25, 2010 3:02 PM
    I am experiencing the same issue, and have been able to replicate it on different hardware (Dell R910 and R610), and have also replicated it on a VMware VM.

    TMG Array.

    100mbps Up and Down Connection. When my TMG firewall service is enabled my upload speed will max out at 12mbps and I lose about 10 or 15mbps off the download speed.

    Any help on this would be appreciated!

    Tuesday, August 31, 2010 11:47 PM
  • Hi Sunil / Exchange Street Solutions,

    Can you disable NLB completely on all network cards, apply the changes, restart the TMG Firewall Service on the server, and then test and update the thread with the findings?

    Also, I want you to disable Malware Inspection on the Access Rule that is used for this communication, if it's enabled.

    Also, if you are using a website to test, that is not a valid test for a 100 MB line; such tests usually work for home broadband connections.

    Use this:

    http://research.microsoft.com/en-us/downloads/05136260-202d-4a01-bb29-33454d0c30c2/ 
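    As an alternative to web-based tests, a rough sketch of a point-to-point TCP upload measurement between two hosts you control is shown below. It is not a replacement for the tool linked above; the function names are illustrative, and the demo runs over loopback only so that it is self-contained. In a real test the sink would run on a machine on the far side of TMG.

```python
import socket
import threading
import time


def run_sink(server: socket.socket) -> None:
    """Accept one connection and discard everything it sends."""
    conn, _ = server.accept()
    with conn:
        while conn.recv(65536):
            pass


def measure_upload(host: str, port: int, total_bytes: int) -> float:
    """Send total_bytes to host:port and return the achieved rate in Mb/s."""
    chunk = b"\0" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        while sent < total_bytes:
            n = min(len(chunk), total_bytes - sent)
            sock.sendall(chunk[:n])
            sent += n
        # Signal end of data, then wait for the sink to close its side so the
        # timing covers delivery rather than just local buffering.
        sock.shutdown(socket.SHUT_WR)
        sock.recv(1)
    elapsed = time.perf_counter() - start
    return sent * 8 / elapsed / 1_000_000


if __name__ == "__main__":
    server = socket.create_server(("127.0.0.1", 0))  # port 0: let the OS pick
    port = server.getsockname()[1]
    threading.Thread(target=run_sink, args=(server,), daemon=True).start()
    print(f"{measure_upload('127.0.0.1', port, 50_000_000):.0f} Mb/s over loopback")
```

    Running the sender on each side in turn gives separate upload and download figures for the same path.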

    Thanks,

    Junaid

    Saturday, September 04, 2010 3:59 PM
    Thanks Junaid, I have opened a priority case with Microsoft, and they still can't figure it out on day 3.  I have two Cisco boxes on the way, and if they arrive before Microsoft can figure it out, I'm going to have to just dump TMG.

    I have disabled everything you possibly can in TMG, from NAT and Inspection to Logging; in fact I can't find anything else to disable. I've added persistent routes, static ARP tables, etc., and disabled TCP offloading, TCP Checkup, etc.

    I know online tests are not accurate and won't ever show me my full 100mb pipe, but if I disable TMG I can see 64mb, or 100mb plus some change if I run two tests at the same time... If I have the TMG Firewall enabled I get 12mbps, consistently. Every time.

     

     

     

    Saturday, September 04, 2010 5:58 PM
    Thanks for the update. Can you please share the case number with me (junaid.jan@microsoft.com)? These types of cases take some time, as a lot of data needs to be analyzed and we may also ask for data collected in different scenarios.

    Thx, Junaid

    Saturday, September 04, 2010 6:19 PM
  • Hello,

    The Support Case with Microsoft continues on the slow TMG upload issue.

    What we have done so far is to capture upload traces at the client and TMG. Initial analysis from Saurav indicates that the WebProxy Filter is taking the upload packets and repackaging them into much smaller IP fragments (i.e. a much higher TCP/IP overhead with less data transferred, leading to a lower upload bandwidth).
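    To put rough numbers on the overhead argument, here is a back-of-the-envelope sketch. The actual packet sizes seen in the traces are not stated here, so the 256-byte figure is purely illustrative; the header sizes assume plain IPv4 and TCP without options over Ethernet.

```python
TCP_IP_HEADERS = 40      # 20-byte IPv4 header + 20-byte TCP header, no options
ETHERNET_OVERHEAD = 38   # preamble 8 + header 14 + FCS 4 + inter-frame gap 12


def payload_efficiency(payload_bytes: int) -> float:
    """Fraction of on-the-wire bytes that carry application data."""
    wire = payload_bytes + TCP_IP_HEADERS + ETHERNET_OVERHEAD
    return payload_bytes / wire


def packets_per_second(rate_mbit: float, payload_bytes: int) -> float:
    """Packets per second needed to carry rate_mbit (Mb/s) of payload."""
    return rate_mbit * 1_000_000 / 8 / payload_bytes


if __name__ == "__main__":
    # 1460 bytes is a full-sized segment on a 1500-byte MTU; 256 is a
    # hypothetical "much smaller" size chosen only for comparison.
    for payload in (1460, 256):
        print(f"{payload:>5} B payload: {payload_efficiency(payload):.1%} efficient, "
              f"{packets_per_second(80, payload):,.0f} packets/s for 80 Mb/s")
```

    Under these assumptions a 1460-byte payload is about 95% efficient and a 256-byte payload about 77%, while the packet rate needed for 80 Mb/s rises from roughly 6,850 to roughly 39,000 per second. Header overhead alone therefore may not explain a drop to 6 Mb/s; the per-packet processing cost is likely to matter as well.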

    As a test I have taken the WebProxy Application filter OFF the HTTP Port-80 protocol in TMG and suddenly, upload and download speeds are SYMMETRIC (finally) - so the current progress is that we have narrowed the problem of the slow uploads to the WebProxy Application filter, but the how and why is still being investigated.

    Please note that these are just preliminary findings and that nothing has been verified yet, but if any of you have a testing TMG server and can verify that unchecking the WebProxy filter in the HTTP protocol returns the upload speed to "normal" (symmetric if you have a symmetric line) then we would be grateful. With the WebProxy application filter turned off, we were able to have upload and download speeds of 80MB/s plus on our 100MB line (probably for the first time ever).

    Turning off the WebProxy filter does have implications though, such as also disabling caching and application-layer filtering, so in our environment simply turning off the WebProxy filter is not really an option as we use these features.

    I would also like to echo "Exchange Street Solutions": we know that online download/upload speed tests are mainly for broadband connections, but we are first looking at the speed differences (80 down vs 6 up on a 100 symmetric line) and online speed tests are adequate for this.

    I'll continue to keep you updated. The turn around on Support is a little slow but packet analysis is slow and boring work (at least to me it is)

     

    Sunil

    Friday, September 10, 2010 3:46 PM
  • Thanks for the update. Pleased to hear that progress is being made.
    Keith Alabaster - MVP/Forum Moderator
    Friday, September 10, 2010 5:27 PM
  • hey guys,

    I haven't read the entire thread, but has PMTU discovery been enabled?

     

    Tom

     


    MS ISDUA/UAG DA Anywhere Access Team Get yourself some Test Lab Guides! http://blogs.technet.com/b/tomshinder/archive/2010/07/30/test-lab-guides-lead-the-way-to-solution-mastery.aspx
    Friday, September 10, 2010 6:13 PM
  • Hi Tom - thanks for jumping in. Not sure on that one.

    Sunil, can you check this out? Both Tom and Jason have suggested this is an area to investigate and verify?

    http://msdn.microsoft.com/en-us/library/ms817967.aspx


    Keith Alabaster - MVP/Forum Moderator
    Saturday, September 11, 2010 2:48 AM
    I haven't heard of any issue in this area; at least, no case (so far) has been opened with Microsoft CSS where customers are facing this issue. I have TMG in my lab and I just did a test using the same web site (speedtest.net) and here are the results:

    Without TMG
    Download - 19.26 Mb/s
    Upload - 4.27 Mb/s
    http://www.speedtest.net/result/949328096.png

    With TMG
    Download - 20.73 Mb/s
    Upload - 4.44 Mb/s
    http://www.speedtest.net/result/949329426.png

    It would be interesting for you to use netmon and verify whether the packet that leaves your workstation and goes out to this web site (passing through TMG) is using an optimized route, or whether there are any ICMP redirects in the middle of this conversation. The only thing that I can think of right now that could potentially cause this is networking issues or non-optimized routes. It may be that I'm not able to repro your issue because it only happens when the upload speed is very high, but again, there are no official reports from others who have implemented TMG in big environments and are having such an issue.

    HTH,

     

     


    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Saturday, September 11, 2010 4:01 PM
  • Sunil, in Tom, Jason and Yuri, you now have the attention of the very top team in respect to experience and my ability to escalate up the chain.

    Can you advise on your case number as we can then see the full details of what has been tested so far, the findings etc?

    Thanks

    Keith


    Keith Alabaster - MVP/Forum Moderator
    Saturday, September 11, 2010 4:31 PM
  • Sunil,

    I have tried the suggested test and did not get the desired results. It is possible that I am not disabling the Web Proxy Application filter properly. Can you tell me the specific steps?

     

    This is what I did:

    Under FireWall Policy, I selected the Web Access rule

    Then selected Properties and the Protocols tab

    Selected HTTP and edit

    Selected Parameters Tab

    Unchecked Applications Filters: Web Proxy Filter

    Thanks

    Sunday, September 12, 2010 5:36 AM
  • Keith,

    I heard from one of our engineers that Sunil opened a case with MS for this. I'm following up on this closely from now on and will update this thread once we have identified the root cause for the issue.

    Thanks,

     


    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Monday, September 13, 2010 1:17 AM
  • Hello everyone, I just want to get all the replies to various people in a single post:

    @Keith:
    The Microsoft Support Case number for the "Slow Upload Speeds Through TMG 2010" is 110081829519845. I'm not sure what they're doing at the moment. I was supposed to hear back on Friday but didn't so I will chase them later today (after this post and a meeting).

    @ObjectB:
    It is surprising that turning off the WebProxy Filter did not work for you. I was strongly confident that this is where the problem was. The steps you followed are the same ones I did. I wasn't able to actually disable the WebProxy Application Filter but I could stop clients from using it by turning it off in the HTTP protocol. I think I also turned off/disabled the web cache, and all clients are SecureNAT clients as opposed to proxy clients - perhaps you also need to verify these additional items as well as the WebProxy Filter.

    @Junaid:
    We are currently performing tests and working with a stand-alone TMG server since this also has the same problem of slow uploads. This means the array and NLB don't come into play here, which limits the scope of the troubleshooting.

    @Exchange Street Solutions:
    Your Cisco boxes have probably arrived by now and you're probably using them (and enjoying full upload speed). If you still have enough of your previous TMG array running, then you could try turning off the caching and deselecting the WebProxy Application filter from the HTTP-80 protocol definition in TMG and using SecureNAT clients to see whether the upload speed returns to normal.

    @Thomas:
    Thanks for joining this somewhat interesting thread/problem. From what I have read, it seems that Server 2008 R2 has PMTU enabled by default and you need to use a WMI script to turn it off. If this isn't the case, can you tell me how to enable PMTU (and whether there is something similar for HP ProCurve switches)?

    @Yuri
    Thanks for joining this somewhat interesting thread/problem. The results you got with the speed test are interesting - are you on a 20Meg symmetrical line? You're getting similar results with and without TMG but this is not what we have encountered in our scenario. We get almost-symmetrical and near line-speed upload and download speeds without TMG, but with it our uploads are limited to about 6MB/s out of 100MB/s. I still think this is down to something TMG is doing, as we have already eliminated the line, ISP router and switches systematically by trying other firewalls in place of TMG. Only TMG has this asymmetry problem in our environment.

    @Everyone
    If you can follow the Support Case, the number is 110081829519845. At the moment, I need to get back in touch with Saurav Datta who I expected to contact me last week. I will be sending him an email today to try and get this moving again. We have captured packet traces of data being uploaded to a web site from the client and captured the trace as this traffic passed through the TMG.
    Wednesday, September 15, 2010 8:36 AM
  • I got a customer to test a recent TMG array deployment with a 100 Mbps link and they achieved similar upload/download speeds of about 80-90 Mbps.

    HOWEVER, this was only possible after configuring the Hyper-V host upon which the TMG guests run with the correct NIC speed/duplex settings...until they did this, they got very poor upload speeds from TMG...

    I had the exact same problem with their ISA2k6 deployment (which used physical servers) until they configured the ISA NIC and switches to agree on speed/duplex settings; in pretty much every case I have seen, asymmetric throughput speed issues have always been down to duplex settings...I really hope MS can get to the bottom of this one...

    Any use?

    Cheers

    JJ

     


    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Wednesday, September 15, 2010 6:08 PM
  • Thanks Jason !!

    We are almost done in the investigation and as soon as we provide the solution for the case we will share more details in this thread.

    Thanks,

     


    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Wednesday, September 15, 2010 6:12 PM
  • Sunil, After further testing and adjustments, I am getting the results you expected. First, I set my external NIC to half duplex, then I tested from a client. When I tested on the TMG server itself, I did not get these results, but on the client box, I get the exact results. I even flipped Web Proxy Filter on and off several times just to make sure. Seems like you have it isolated to the Web Proxy Filter.

    Sunday, September 19, 2010 2:23 PM
  • Interesting info - but let's wait till the results are published by Yuri and the rest of the team once they have completed their findings and analysis. The impact of incorrectly set duplex and NIC settings is well documented - and published in many places - so I would be surprised if it is quite that simple; if it was I think this would have been closed as a question quite some time ago, so I'll hold off the cheers just for the moment.
    Keith Alabaster - MVP/Forum Moderator
    Sunday, September 19, 2010 3:02 PM
  • Hi.

    I experienced almost the same issue, running several mail/web servers behind a TMG2010sp1 (2008r2) with two ISPs. The problem was narrowed down to NIC speeds. The vmware host defaults new interfaces to 10Gbps but the actual host bandwidth is 1Gbps. The uplink to the ISP is 50/50Mbit. After changing the interface speeds on the TMG-server to 100Mbit/full all issues are resolved. I guess Windows optimizes for 10Gbit links and it leads to a lot of retransmits when the physical link is quite different...
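
    As a rough illustration of why retransmits hurt so much: the well-known Mathis approximation bounds a single TCP connection's throughput by (MSS / RTT) * (C / sqrt(p)), where p is the packet loss rate. The Python sketch below just evaluates that formula. The MSS, RTT and loss values are hypothetical examples, not measurements from this setup, and the function name is made up for illustration.

```python
import math


def mathis_throughput_mbit(mss_bytes: int, rtt_ms: float, loss_rate: float) -> float:
    """Estimate a single TCP flow's throughput ceiling in Mbit/s.

    Uses the Mathis approximation: rate <= (MSS / RTT) * (C / sqrt(p)),
    with C = sqrt(3/2). All inputs here are assumed example values.
    """
    if rtt_ms <= 0 or not (0 < loss_rate <= 1):
        raise ValueError("rtt_ms must be > 0 and loss_rate in (0, 1]")
    c = math.sqrt(3.0 / 2.0)
    bits_per_second = (mss_bytes * 8) / (rtt_ms / 1000.0) * c / math.sqrt(loss_rate)
    return bits_per_second / 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs: 1460-byte MSS, 20 ms round trip, two loss rates.
    for loss in (0.01, 0.0001):
        estimate = mathis_throughput_mbit(1460, 20.0, loss)
        print(f"loss={loss:.4%}: about {estimate:.1f} Mbit/s")
```

    With these example inputs, 1% loss gives an estimate of roughly 7 Mbit/s, while 0.01% loss gives roughly 70 Mbit/s, so even modest loss from a speed mismatch could plausibly explain a large drop.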

     

     

    Monday, September 20, 2010 8:30 AM
  • Well, this week I was on a long call with Microsoft Support and there were changes made around the buffer size for the Web Proxy application filter. This involved a few registry entries and running a vbs script that changed some TMG internal settings. I don't really know what exactly was changed, but the result is that the standalone test TMG server that we have been working on can now manage a consistent 90-95 Mbit/s DOWN and 46 Mbit/s UP. This is an increase from 6 Mbit/s UP, so it is a huge improvement from before.
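
    For anyone wondering how a buffer size could cap upload speed: if a proxy only keeps a fixed number of bytes in flight per round trip, throughput cannot exceed the buffer size divided by the round-trip time (the bandwidth-delay product relationship). The Python sketch below works through that arithmetic. The buffer sizes and the 20 ms RTT are hypothetical values chosen for illustration; they are not the settings Microsoft Support changed, which have not been published here.

```python
import math


def max_throughput_mbit(buffer_bytes: int, rtt_ms: float) -> float:
    """Upper bound in Mbit/s when at most buffer_bytes are in flight per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return (buffer_bytes * 8) / (rtt_ms / 1000.0) / 1_000_000


def buffer_needed_bytes(target_mbit: float, rtt_ms: float) -> int:
    """Bandwidth-delay product: bytes in flight needed to sustain target_mbit."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    return math.ceil(target_mbit * 1_000_000 / 8 * (rtt_ms / 1000.0))


if __name__ == "__main__":
    rtt_ms = 20.0  # assumed round-trip time to the upload target
    for buffer_bytes in (16 * 1024, 64 * 1024, 256 * 1024):  # assumed buffer sizes
        limit = max_throughput_mbit(buffer_bytes, rtt_ms)
        print(f"{buffer_bytes // 1024} KiB buffer: at most {limit:.1f} Mbit/s")
    print("Bytes needed for 100 Mbit/s:", buffer_needed_bytes(100.0, rtt_ms))
```

    Under these assumed numbers, a 16 KiB buffer at 20 ms RTT tops out at about 6.5 Mbit/s, and filling a 100 Mbit/s line at the same RTT would need about 250,000 bytes in flight.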

    I have asked for a summary and instructions of the changes made as I want to apply these changes to our production TMG Array but I haven't heard anything back yet.

    At this point I'm not sure whether the increase in upload speed is enough to satisfy me or my boss and I am in two minds about whether a hardware appliance would give better performance.

    As this is not a widespread issue from what I've been told, the depth to which we're having to dig to get to the bottom of this problem doesn't seem to be worth it right now.

    Sunil

    @Findahl: We've already tried a variety of NIC speed settings on the TMG and switch, and anything other than "Auto Negotiate" gives very poor performance. TMG is running on physical hardware so this isn't an issue for us, but thanks for the suggestions anyway.

    Thursday, September 23, 2010 10:05 AM
  • It would be great if you could share these registry changes and script code...
    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Thursday, September 23, 2010 10:32 AM
  • We are working on a blog post to document this, it should be out next week or so.
    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Friday, September 24, 2010 3:48 PM
  • I think I'll wait for the Blog Post that will document the fixes as there may be consequences from the registry changes and vbs script that I am not fully aware of and that would be much better covered and explained in the blog post.

    Sunil

    Tuesday, September 28, 2010 8:33 AM
  • I think I'll wait for the Blog Post that will document the fixes as there may be consequences from the registry changes and vbs script that I am not fully aware of and that would be much better covered and explained in the blog post.

    Sunil


    Ok, fair enough :)
    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Tuesday, September 28, 2010 11:40 AM
  • We are working on a blog post to document this, it should be out next week or so.
    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes

    Good stuff!
    Jason Jones | Forefront MVP | Silversands Ltd | My Blogs: http://blog.msedge.org.uk and http://blog.msfirewall.org.uk
    Tuesday, September 28, 2010 11:40 AM
  • Hello

    We have the exact same problem with several ISA 2006 and TMG servers.

    We can't get more than 20 Mbit/s upload behind an ISA/TMG server.

    Is there any news about this? I hope you find a solution really soon and post it here.

    Valle

    Monday, October 04, 2010 7:41 AM
  • Over the weekend I made the change to our TMG Array on the production side and saw the upload speed go from 7Mbit/s up to around 48Mbit/s on our 100Mbit line (download was 91Mbit/s). I don't think we'll get any more in terms of upload speed out of the Array, which is a shame but never mind.

    I've been advised against posting the fixes from Microsoft Support, which is why I haven't described what I did, so we're all waiting on Yuri to post his blog entry on this topic to enlighten everyone.

    Sunil

    Monday, October 04, 2010 11:54 AM
  • Yuri - any news on this?
    Keith Alabaster - MVP/Forum Moderator
    Saturday, October 09, 2010 4:17 PM
  • Instead of a post on TMG Blog, we are working on an official KB for that. Will post the KB number here once it is finished.
    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Tuesday, October 12, 2010 1:00 AM
  • Hi Yuri,

    Is the KB far away?

    We are also experiencing a similar issue. We have a 100Mbit Ethernet Internet link directly into a Verizon POP and cannot see symmetric performance. We see ~80-90Mbits down and ~40Mbits up. In our case we are using Forefront TMG 2010 Std Edition (Quad Core CPU with 8GB RAM, 40GB cache, 2x1GigE NIC running as a team, Win2008 R2) as a single-homed web proxy service behind Cisco ASA firewalls.

    On direct tests there's no issue or processing latency and the customer is happy; via TMG they're unhappy with the added latency and the speed.

    Our DNS is internal.

    NIS, flood mitigation, IPS, compression, logging (was disabled, re-enabled tonight to assist with request processing statistics) and most if not all web and application filters are disabled. We are running a Websense Filter plugin, yet disabling this does not make any impact.

    Download\upload performance with ISA 2006 (Ent not Std) was similar yet we found latency was better and the customer happier.

    Wednesday, October 13, 2010 1:28 PM
  • Unfortunately it is going to take a bit longer due to the tests that the Product Team is currently doing on this. Although we might have a workaround, we are working towards the root cause to fix the issue.
    Yuri Diogenes [MSFT] - http://blogs.technet.com/yuridiogenes
    Friday, October 15, 2010 2:12 AM