Failover Cluster Network - MPIO

    Question

  • Each of my Hyper-V hosts has 2 NICs configured for iSCSI using MPIO.

    I have set the MPIO policy as Round Robin with Subset.

    When I check the Failover Cluster Network settings (in the Failover Cluster Manager) for the iSCSI network I can only see one iSCSI NIC on each host.

    a.) Is this by design, i.e. because I am using MPIO and Round Robin? (I have changed to Least Queue Depth etc. and still see only one iSCSI NIC in the Failover Cluster Manager.)

    b.) Is there a recommended policy to use, i.e. Round Robin vs. Least Queue Depth vs. Weighted etc.?

    c.) Any recommended tools to test that MPIO is actually working? I verified by checking the sessions and paths in the iSCSI Initiator and the disks themselves in Device Manager -> MPIO (some paths show as unoptimized but again, I believe this is by design).

    Sunday, September 01, 2013 6:26 PM

Answers

  • a) Yes, but please provide screenshots of both configuration pages to be sure. Failover Clustering enumerates one cluster network per subnet and registers only one adapter per node on each network, so if both iSCSI NICs sit on the same subnet only one of them will show up in Failover Cluster Manager. MPIO itself is unaffected, as it works in the storage stack below cluster networking.
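
    To see how the cluster has enumerated your networks, here is a minimal sketch using the built-in FailoverClusters PowerShell module (Server 2012 and later); run it on any cluster node:

        # Requires the Failover Clustering feature and its PowerShell module
        Import-Module FailoverClusters

        # One entry per subnet the cluster has discovered
        Get-ClusterNetwork | Format-Table Name, Role, Address, AddressMask

        # One entry per node per network: if both iSCSI NICs share a subnet,
        # only one interface per node is listed for that network
        Get-ClusterNetworkInterface | Format-Table Node, Network, Name, Adapter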

    b) It depends. If your iSCSI storage supports true Active-Active (for example StarWind), then Round Robin is the way to go. If it is Active-Passive (for example Dell EqualLogic), you are better off with Fail Over Only. For implementations that are Active-Active but with ALUA (Asymmetric Logical Unit Access; EMC VNX is a decent example that comes to mind), Fail Over Only, Weighted Paths or Round Robin will all work. Check what your vendor recommends. Also make sure to install the vendor-provided MPIO stack if there is one (for example PowerPath for the mentioned EMC arrays), as it will certainly perform better than the Microsoft-provided one, and for a reason: the vendor's own implementation knows which node actually holds the data for a read, so you skip a possible inter-node communication hop that occurs when a generic MPIO stack asks node A for content actually stored on node B of the RAIN. The HP/LeftHand DSM is a good example here.
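
    You can check and change the policy outside the GUI as well. A sketch using the stock Microsoft tools on Server 2012 and later (the MSDSM cmdlets and mpclaim.exe; a vendor DSM may ship its own tooling instead):

        # Default load-balance policy the Microsoft DSM applies to new LUNs
        Get-MSDSMGlobalDefaultLoadBalancePolicy

        # Set the default to Round Robin (other values include FOO, LQD, LB)
        Set-MSDSMGlobalDefaultLoadBalancePolicy -Policy RR

        # mpclaim.exe shows and sets per-disk policies:
        mpclaim -s -d        # list all MPIO disks
        mpclaim -s -d 0      # show paths and current policy for disk 0
        mpclaim -l -d 0 4    # set disk 0 to Least Queue Depth
                             # (1=FOO, 2=RR, 3=RR with Subset, 4=LQD, 5=Weighted)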

    c) That's easy. Select the Fail Over Only policy first and run Iometer (the old Intel I/O benchmark) against a test LUN to get a single-path baseline. Make the LUN a fast one: if you cannot saturate the bandwidth of the uplinks, you will see the same performance numbers every time. So if you are dealing with, say, StarWind, make sure you ENABLE the cache and use the in-memory options to keep the data in RAM if possible; for EQL or VNX, build the test LUN from as many spindles as you can and pick a striped RAID level (RAID10, RAID0 etc.). Run Iometer with, say, 4 workers and 16 outstanding I/Os, doing 64 KB sequential reads and writes (two separate tests, don't mix them!). One GbE connection will give you roughly 100 MB/sec for reads and for writes. Adding the second path (and configuring Round Robin) should nearly double those numbers. If you see the same performance either way, then either your LUN is too slow to deliver more than 100 MB/sec (unlikely: even a pair of striped 10K rpm SAS drives can beat that sequentially) or Round Robin and Active-Active are not actually working.
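
    Besides the raw numbers, you can watch the per-NIC byte counters while Iometer runs; with Round Robin actually working, both iSCSI NICs should grow at roughly the same rate. A sketch, assuming the adapters are named 'iSCSI1' and 'iSCSI2' (substitute your own names):

        # Two sessions per target means both paths are logged in
        Get-IscsiSession | Format-Table TargetNodeAddress, IsConnected, NumberOfConnections

        # Sample both iSCSI NICs every 5 seconds during the test run;
        # with Round Robin, ReceivedBytes/SentBytes should climb evenly on both
        while ($true) {
            Get-NetAdapterStatistics -Name 'iSCSI1','iSCSI2' |
                Format-Table Name, ReceivedBytes, SentBytes -AutoSize
            Start-Sleep -Seconds 5
        }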

    Hope this helped :)


    StarWind iSCSI SAN & NAS

    • Marked as answer by PICfuj Monday, September 02, 2013 11:42 AM
    Sunday, September 01, 2013 9:13 PM