TCP Chimney Offload: should this be enabled or disabled on Hyper-V cluster nodes?

    Question

  • Hi,

    I am facing intermittent packet drops on one of the VMs on the Hyper-V cluster. The cluster consists of five Win2008 R2 Enterprise SP1 Core Edition servers.

    A VM on one of the nodes runs ETL jobs, and some of those jobs fail; we discovered this happens because of the packet drops.

    Should TCP Offload be enabled or disabled on the Hyper-V nodes and VMs?

    Does the vNIC adapter cause the packet drops?

    What is the recommended type of vNIC (synthetic or emulated) for VMs that do a lot of transaction-based activity?

    Regards,

    Maqsood


    Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

    Wednesday, March 21, 2012 06:49

Answers

  • Hiya,

    Here's how it usually goes with network issues on Hyper-V:

    Type of Adapter
    Start with the synthetic adapter type, as it should give better performance. However, some applications have problems with this adapter type, so you may want to switch to emulated to test for this.
    Answer: The recommended type is whatever works; start with synthetic.

    Here is a brief description of the two types and their settings:
    How to Configure Network Adapters for a Virtual Machine
    http://technet.microsoft.com/en-us/library/cc917873.aspx

    Type of host network
    Usually one of the first things you will be asked is whether you are running a set of teamed NICs on your host machine(s). If so, you will be asked to break the team and/or update to the latest drivers from the manufacturer's website.
    Answer: Make sure to test whether you still experience the problem when not running teamed NICs.

    TCP Offloading or not
    In most cases the default offloading setup for the OS and/or NICs works fine. However, many people have reported that disabling offloading improved stability and/or performance.

    TCP Offloading again?!
    http://blogs.msdn.com/b/psssql/archive/2010/02/21/tcp-offloading-again.aspx

    Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008
    http://support.microsoft.com/kb/951037

    Bottom line: if you experience problems like the ones you are describing, disabling offloading is a very good place to start.
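    For reference, the global TCP Chimney setting on Windows Server 2008 R2 can be inspected and toggled from an elevated command prompt. A minimal sketch using the standard netsh switches:

    ```shell
    :: Show the current global TCP settings (Chimney, RSS, NetDMA state)
    netsh int tcp show global

    :: Disable TCP Chimney Offload globally
    netsh int tcp set global chimney=disabled

    :: Re-enable it later if disabling it made no difference
    netsh int tcp set global chimney=enabled
    ```

    The change applies to new connections without a reboot; connections that are already established keep their current offload state.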

    Wednesday, March 21, 2012 07:42
  • TCP Chimney should be left on and left alone.

    TCP Task Offload settings should only be modified if your VM workloads exhibit certain symptoms (commonly seen on Terminal Server, file server, Exchange, and SQL VMs).

    In your case it sounds as if the application in the VM might operate better with the TCP Task Offload settings disabled (task offload, large send offload, etc. - NOT TCP Chimney).

    These are very distinct and very different settings. Disabling Chimney will not help the symptoms that disabling Task Offload cures.
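    To make the distinction concrete: Chimney is toggled globally through netsh, while Task Offload has its own documented registry switch. A sketch (the DisableTaskOffload value lives under the TCP/IP parameters key; 1 = disabled):

    ```shell
    :: TCP Chimney (full connection offload) - global netsh setting
    netsh int tcp set global chimney=disabled

    :: Task Offload (checksum / large-send offload) - separate registry switch, 1 = disabled
    reg add HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters /v DisableTaskOffload /t REG_DWORD /d 1 /f
    ```

    A reboot is typically needed for the registry change to take effect, and per-adapter offload properties can still be set individually in Device Manager.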

    In most cases, try updating the NIC firmware and NIC drivers before disabling these settings.

    http://itproctology.blogspot.com/2011/03/tcp-checksum-offload-is-not-equal-to.html


    Brian Ehlert (hopefully you have found this useful)
    http://ITProctology.blogspot.com
    Learn. Apply. Repeat.
    Disclaimer: Attempting change is of your own free will.


    Wednesday, March 21, 2012 15:12

All Replies

  • Thanks everyone for the response.

    I have planned firmware and driver upgrades on the Hyper-V servers and will update the status after testing.

    Regards,

    Maqsood


    Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

    Thursday, March 22, 2012 07:58
  • Hello Guys,

    I did the firmware upgrade, but there is no difference; the packet drops continue.

    Just to brief you on the test scenario:

    We are using an IBM BladeCenter H with HS21 and HS22 blades. I tested PING from a rack-mount server on the same LAN as the BladeCenter to the following servers in the BladeCenter:

    1 - Domain Controller (Physical Server - Stand Alone)

    2 - HyperV Cluster Node 

    3 - SQL Cluster Node (Physical Server)

    4 - Virtual Machine on HyperV Cluster 

    and I see intermittent packet drops on all of these servers, not in any particular sequence.

    Note: All of these servers are running the Win2008 R2 SP1 OS.

    Is this issue related to the OS or the NIC? Could it be the BladeCenter switch, which is running three-year-old firmware?

    Should I follow the instructions from this post?

    http://blogs.msdn.com/b/psssql/archive/2010/02/21/tcp-offloading-again.aspx?PageIndex=2#comments
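    One way to pin down when the drops occur is to log only the ping failures over a long run. A sketch, where 10.10.10.10 is a placeholder for one of the target servers:

    ```shell
    :: Continuous ping; keep only the non-reply lines (timeouts, unreachable) in a log
    ping -t 10.10.10.10 | find /v "Reply from" >> C:\pingdrops.log
    ```

    Plain cmd.exe find does not timestamp the entries; in PowerShell each line can be prefixed with Get-Date instead.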

    Regards,

    Maqsood


    Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

    Wednesday, March 28, 2012 07:52
  • It could be virtually anything mangling packets along their path. Start isolating the issue: replace components one by one with ones known and confirmed to be working. I would personally still start with the TCP offload options :)

    -nismo

    Wednesday, March 28, 2012 08:27
  • All right, I am starting with this.

    I am doing the following:

    1 - Disable TCP Chimney Offload (at the host level)

    2 - IPv4 Checksum Offload - None (at the NIC level)

    3 - IPv4 Large Send Offload - Disabled (at the NIC level)

    4 - Receive Side Scaling - Disabled (at the NIC level)

    5 - TCP Connection Offload - Disabled (at the NIC level)

    Please let me know if I am doing anything wrong; these are all production servers.
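    For what it's worth, the host-level pieces of the plan above (items 1 and 4) map to netsh, while items 2, 3 and 5 are per-adapter properties set under Device Manager > NIC > Properties > Advanced. A sketch of the netsh side, plus a way to verify the result afterwards:

    ```shell
    :: Host level: disable Chimney and RSS globally (items 1 and 4)
    netsh int tcp set global chimney=disabled
    netsh int tcp set global rss=disabled

    :: Verify the global state (expect "disabled" for chimney and RSS)
    netsh int tcp show global

    :: Show the per-connection offload state; "InHost" means nothing is offloaded
    netstat -t
    ```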

    Regards,

    Maqsood


    Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

    Wednesday, March 28, 2012 11:51
  • Maqsood: We have a very similar issue on a very similar configuration.  We are experiencing occasional dropped packets and random RSTs.  Very curious about the status of your problem.
    Thursday, July 12, 2012 22:46
  • Hi,

    Actually, I did all of this on the Hyper-V cluster nodes, so as not to leave any doubt:

    1 - Disable TCP Chimney Offload (at the host level)

    2 - IPv4 Checksum Offload - None (at the NIC level)

    3 - IPv4 Large Send Offload - Disabled (at the NIC level)

    4 - Receive Side Scaling - Disabled (at the NIC level)

    5 - TCP Connection Offload - Disabled (at the NIC level)

    I had also opened a case with Microsoft about this issue, and after intensive analysis we reached the conclusion that the problem was not related to the Hyper-V cluster; it was the VM itself, which was over-utilized.

    So I don't know what your situation is.

    Regards,

    Maqsood


    Maqsood Mohammed Senior Systems Engineer MCITP-Enterprise Admin & ITILv3 Foundation Certified

    Sunday, July 15, 2012 06:40