DPM 2010 - Terrible backup performance of VMs on an EqualLogic PS6100X

  • Question

  • I hope I'm not asking a redundant question. I've found several "best practice" documents regarding the implementation of a SAN with DPM 2010, but I'm a newbie when it comes to SAN technology, so I'm learning as I go.

    I finally got backups to occur for VMs that are stored on a SAN, but the backup is VERY slow; I believe it's even timing out. As I understand it, the SAN should take the snapshot and DPM should then back up that snapshot. I looked in the EqualLogic Group Manager, but it doesn't look like snapshots are happening. We defined the CHAP user during the Remote Setup Wizard, but the CHAP user is not defined in the Access tab of the volume. The entry is currently an asterisk (*) and CHAP authentication shows "not set". Does the CHAP user account need to be defined here? If so, why? Is it not enough to have it defined in the Remote Setup Wizard? If I do define this account here, will it improve backup performance on the DPM side?

    I apologize in advance for my ignorance. I just want to understand :) Thanks!

    Monday, July 30, 2012 3:50 PM

Answers

  • So the NIC finally arrived and it was installed yesterday. After updating the NIC's firmware (that's pretty important), I tried configuring the iSCSI offload engine on the NIC, but couldn't figure it out. I've run out of time trying to get this thing working correctly, so I disabled the iSCSI offload engine for all of the ports. I made some additional adjustments to the NDIS settings beyond Flow Control and the Jumbo Packet value:

    - Disabled Large Send Offload (IPv4 and IPv6)
    - Disabled TCP Connection Offload (IPv4 and IPv6)
    - Disabled TCP/UDP Checksum Offload (IPv4 and IPv6)
    - Disabled "Wake on..." stuff

    After doing all that, I made sure everything was set up correctly in Microsoft iSCSI Initiator and confirmed that the Dell EqualLogic MPIO settings were still correct.

    After firing up a VM (which is stored on the SAN), I performed a consistency check.......it was perfect!!! Screamin' fast!!!

    So, the end problem was the Broadcom NIC's iSCSI offload engine. Maybe it won't work because of how our iSCSI network was configured....maybe I have to completely disable NDIS in order to configure the iSCSI adapter? I don't know, and I don't care. All I know is that, right now, I have a machine that is performing exactly the way I want it to, and it's beautiful! At least I don't have to worry about configuring the two production Hyper-V hosts. That job is above my pay grade. :)

    Time to go melt in the sun!

    • Marked as answer by MSS_AxiomEPM Friday, August 17, 2012 8:24 PM
    Friday, August 17, 2012 8:24 PM
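
The NIC settings listed in the answer above are easy to lose track of across several hosts. As a rough sketch only, the list could be kept as a small checklist and compared against values copied by hand from each NIC's advanced properties. The property names below are illustrative labels based on the post, not the exact strings a Broadcom driver exposes, and `find_mismatches` is a hypothetical helper.

```python
# Desired state taken from the list in the post. The property names are
# illustrative labels, not the exact strings a Broadcom driver exposes.
DESIRED = {
    "iSCSI Offload Engine": "Disabled",
    "Large Send Offload (IPv4)": "Disabled",
    "Large Send Offload (IPv6)": "Disabled",
    "TCP Connection Offload (IPv4)": "Disabled",
    "TCP Connection Offload (IPv6)": "Disabled",
    "TCP/UDP Checksum Offload (IPv4)": "Disabled",
    "TCP/UDP Checksum Offload (IPv6)": "Disabled",
}


def find_mismatches(current, desired=DESIRED):
    """Return {property: (current_value, desired_value)} for every property
    whose recorded value is missing or differs from the desired one."""
    mismatches = {}
    for name, want in desired.items():
        have = current.get(name)  # None when the property was not recorded
        if have != want:
            mismatches[name] = (have, want)
    return mismatches


if __name__ == "__main__":
    # Hypothetical values copied by hand from one NIC's properties dialog.
    recorded = dict(DESIRED)
    recorded["Large Send Offload (IPv4)"] = "Enabled"
    for name, (have, want) in find_mismatches(recorded).items():
        print(f"{name}: is {have!r}, should be {want!r}")
```

The sketch does not read anything from the NIC itself; it only compares values that someone has already written down.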

All replies

  • Some additional information about the environment:

    > Not using failover (Single Hyper-V Host; Single Volume on the SAN)
    > All VMs are stored on the single SAN volume
    > When attempting to create a recovery point for a VM, the SAN does not create a snapshot of the volume (is this because the CHAP user is not set on the volume?)

    I hope this helps!

    Monday, July 30, 2012 4:22 PM
  • So I've got the backups working with the EqualLogic Hardware VSS Provider. I have 3 Hyper-V host machines that I'm testing with (HyperV1, HyperV2, HyperV3).

    I'm noticing that a snapshot taken with HyperV3, with only 1 iSCSI connection, is blazingly fast. When taking a snapshot on HyperV2, with 4 iSCSI connections, it's painfully slow. Maybe this is a question for our EqualLogic vendor? I've tried checking for differences in the volumes, but I'm not finding anything.

    Monday, July 30, 2012 6:30 PM
  • Our DPM server has two Broadcom gigabit NICs. Neither of them is plugged into the dedicated SAN network. We're going to update the NIC drivers on this machine, then plug one of the NICs into the SAN network. Will that improve performance? Will it affect performance in any other area? Sorry if the question is vague...

    Thanks!

    Tuesday, July 31, 2012 4:28 PM
  • Well, never mind that....just noticed that all DPM backups happen over the LAN....back to the drawing board.
    Tuesday, July 31, 2012 6:00 PM
  • Actually, I found that a backup network address can be configured, so this might work out after all:

    http://technet.microsoft.com/en-us/library/cc964298.aspx

    Says it's for DPM 2007, but it looks like it should apply to DPM 2010 as well.

    Tuesday, July 31, 2012 7:26 PM
  • So the overall problem was being caused by slow disk reads from the EqualLogic SAN. The solution is to disable delayed ACK (the TcpAckFrequency setting) on all NICs that carry iSCSI traffic. It also didn't hurt to configure all EQL volumes to use the Round Robin MPIO policy. Data now moves insanely fast for all iSCSI-connected machines. Happy times indeed!!!
    • Marked as answer by MSS_AxiomEPM Tuesday, July 31, 2012 11:51 PM
    • Unmarked as answer by MSS_AxiomEPM Friday, August 3, 2012 8:40 PM
    Tuesday, July 31, 2012 11:51 PM
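
For readers who want to try the TcpAckFrequency change from the post above: on Windows it is a per-interface registry value, and setting it to 1 makes the host acknowledge every segment instead of delaying ACKs. The following is a minimal sketch that only builds the `reg add` command line so it can be reviewed before anyone runs it. The function name and the all-zero interface GUID are placeholders, not values from this thread.

```python
# Registry key under which Windows stores per-interface TCP/IP parameters.
INTERFACES_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\Interfaces"
)


def ack_frequency_command(interface_guid, value=1):
    """Build (but do not run) a `reg add` command that sets TcpAckFrequency
    for one network interface. The GUID may be given with or without braces."""
    guid = interface_guid.strip("{}")
    key = f"{INTERFACES_KEY}\\{{{guid}}}"
    return [
        "reg", "add", key,
        "/v", "TcpAckFrequency",
        "/t", "REG_DWORD",
        "/d", str(value),
        "/f",  # overwrite an existing value without prompting
    ]


if __name__ == "__main__":
    # Placeholder GUID; substitute the GUID of each NIC that carries iSCSI traffic.
    print(" ".join(ack_frequency_command("{00000000-0000-0000-0000-000000000000}")))
```

The command would need to be run from an elevated prompt on each host, and the change typically takes effect only after a restart.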
  • So changing the TcpAckFrequency and MPIO policy was not the real solution. I'm 99.999% confident that the TRUE, FINAL ANSWER is.........drumroll please.......outdated Broadcom NetXtreme II 5709 NIC drivers *facepalm*. The drivers were never updated when the NICs were installed. Why? I have no idea. I didn't do the installation. I only came to this conclusion after digging around in the Hyper-V host's EqlTrace.log files. I was seeing discrepancies with the HBA iSCSI provider on the two problematic machines when I compared against the Hyper-V host that was working wonderfully.

    Hopefully, the drivers on the problematic Hyper-V hosts can get updated this weekend. So, as much as one might think that the dead horse has been beaten enough.....ALWAYS UPDATE YOUR NIC DRIVERS AFTER INSTALLATION!!!!

    Thank you. Have a good weekend :)

    • Marked as answer by MSS_AxiomEPM Friday, August 3, 2012 8:51 PM
    • Unmarked as answer by MSS_AxiomEPM Wednesday, August 15, 2012 7:28 PM
    Friday, August 3, 2012 8:51 PM
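
The comparison described in the post above (two problem hosts against one host that works well) can also be applied to driver versions. Below is a small sketch, assuming the versions have been collected by hand as dotted strings; the version numbers are made up for illustration and `outdated_hosts` is a hypothetical helper.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.2.3' into a tuple of ints,
    so that versions compare numerically rather than as text."""
    return tuple(int(part) for part in text.split("."))


def outdated_hosts(versions, reference_host):
    """Return, sorted by name, the hosts whose driver version is older
    than the version installed on the reference host."""
    reference = parse_version(versions[reference_host])
    return sorted(
        host
        for host, version in versions.items()
        if parse_version(version) < reference
    )


if __name__ == "__main__":
    # Hypothetical driver versions; HyperV3 is treated as the known-good host.
    drivers = {"HyperV1": "5.2.14.0", "HyperV2": "5.2.14.0", "HyperV3": "7.2.8.0"}
    print(outdated_hosts(drivers, "HyperV3"))
```

Comparing tuples of integers avoids the trap where "10.0" sorts before "9.9" as plain text.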
  • Well, back to the drawing board. Updating the NIC driver wasn't the solution, although it needed to be done. We've ordered the exact same NIC that's in our problem Hyper-V hosts (1 & 2). Those NICs were initially installed with drivers from Dell, so we'll install this NIC in the exact same fashion to try to repro the problem. From another thread in the VMM section, I found out that the Dell-supplied NIC drivers automatically enable the iSCSI Offload Engine, which, if not configured properly, does not play nicely with VSS. Here's the post I'm referring to:

    SCVMM 2008 R2 - Host Refresh causes Event ID 113 MSiSCSI events on Hyper-V Cluster

    I have a couple of solutions that I'm going to try from there, such as properly configuring the iSCSI Offload Engine (or disabling it on the NIC altogether). While I don't like learning via "trial by fire", especially in a production environment, I've certainly gained a lot of experience with trying to figure this stuff out.


    Wednesday, August 15, 2012 7:42 PM