SCVMM 2008 R2 - Host Refresh causes Event ID 113 MSiSCSI events on Hyper-V Cluster

    Question

  • Hello TechNet Community,

    Yesterday I installed SCVMM 2008 R2 into a Windows Server 2008 R2 VM which runs on top of a Windows Server 2008 R2 failover cluster (a Dell-supported configuration). This physical two-node cluster is configured for high availability and uses Fibre Channel and iSCSI storage. Both nodes were built together and are identical in their hardware, software and build sequence. The nodes use separate network adapters for management, Hyper-V, Cluster Shared Volumes, live migration, and iSCSI. I'm using the built-in Microsoft iSCSI Initiator to access an iSCSI LUN configured as a Cluster Shared Volume.

    Since installing SCVMM, I'm seeing a set of Event ID 113 MSiSCSI events on both hosts at the same time every 30 minutes. These events are logged for every path to the iSCSI RAID controllers. The events look as follows:

    Log Name:      System
    Source:        MSiSCSI
    Date:          22/09/2010 3:08:52 PM
    Event ID:      113
    Task Category: None
    Level:         Warning
    Keywords:      Classic
    User:          N/A
    Computer:      NGFVH1.xyz.local
    Description:
    iSCSI discovery via SendTargets failed with error code 0xefff0024 to target portal *192.168.130.101 0003260 B06BDRV\L4SC&PCI_163914E4&SUBSYS_02D31028&REV_20\5&31271725&0&30050300_0 .

    I believe these occur during the SCVMM host refresh cycle (every 30 minutes?) and I'm able to reproduce the errors by manually refreshing the hosts using the SCVMM console.

    The hosts and the SCVMM VM are patched with all current Microsoft critical and security updates, and SCVMM has patches KB2308590 and KB982523 applied. The cluster hosts and storage are running the latest Dell firmware/drivers.

    Can anyone please assist with this issue?

    Kind Regards,

    David Barrett

    Wednesday, September 22, 2010 6:12 AM

Answers

All replies

  • Hi.

    I believe this is related to the target.

    Are you seeing any other error messages related to iSCSI? (iscsiprt 7, 24, 30?)

     


    Kristian
    Wednesday, September 22, 2010 6:24 AM
    Moderator
  • Hi Kristian,

    I also thought iSCSI target at first, however, the Event ID 113 MSiSCSI events only started immediately after adding the hosts into SCVMM and have occurred every half hour since. I'm not seeing any other errors/warnings in the event logs on the hosts.

    I've also looked at the iSCSI SAN and the counters on the two Ethernet switches which are dedicated to iSCSI communication for this cluster and I'm not seeing any errors.

    One detail I left out of my original post is that Host 2 experienced a temporary FC and iSCSI disconnect while being added to SCVMM. This didn't occur on Host 1 and it hasn't occurred since. The events logged were as follows:

    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          21/09/2010 5:51:05 PM
    Event ID:      5120
    Task Category: Cluster Shared Volume
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      NGFVH2.xyz.local
    Description:
    Cluster Shared Volume 'Volume2' ('HYPER-V_FC') is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

    Log Name:      System
    Source:        Microsoft-Windows-FailoverClustering
    Date:          21/09/2010 5:51:05 PM
    Event ID:      5120
    Task Category: Cluster Shared Volume
    Level:         Error
    Keywords:     
    User:          SYSTEM
    Computer:      NGFVH2.xyz.local
    Description:
    Cluster Shared Volume 'Volume3' ('HYPER-V_ISCSI') is no longer available on this node because of 'STATUS_CONNECTION_DISCONNECTED(c000020c)'. All I/O will temporarily be queued until a path to the volume is reestablished.

    I'm thinking these disconnects may have been a glitch and are not necessarily related to the Event ID 113 events that occur during the SCVMM host refresh.

    Regards,

    David Barrett

    Wednesday, September 22, 2010 8:44 AM
  • This post is older than 30 days or we have not heard back from you.  Did this issue get resolved?  If so, please share with the community how you resolved it.  Otherwise, re-activate the post if you still require assistance.


    Carmen M. Summers (MSFT) --Posting is provided "AS IS" with no warranties, and confers no rights.
    Thursday, October 21, 2010 8:12 PM
  • I'd like to bump this thread.  I'm seeing the exact same behavior.  This is my configuration:

    • 2 Hyper-V R2 (Enterprise Edition) hosts on Dell PE R610 servers;
    • Dell MD3000i iSCSI SAN (active/active configuration with redundant iSCSI controllers);
    • Broadcom NetXtreme II NICs for iSCSI connectivity (no teaming, and hardware TOE/iSCSI disabled).

    We are not noticing any performance issues but these event log entries are occurring every time a host refresh occurs in VMM 2008 R2.

    VMM 2008 R2 build is 2.0.4275.0 (agents and server).

    Thanks,

    Kevin Powell

     

    Friday, November 05, 2010 6:36 PM
  • I am having this exact same issue.

     

    4-node Windows Server 2008 R2 cluster

    N series iSCSI SAN

    Tuesday, February 08, 2011 5:35 PM
  • I am experiencing this same issue.

        SCVMM 2008 R2

        3-node Server 2008 R2 cluster

        PowerEdge R510, EqualLogic PS6500X

     

    Wednesday, February 09, 2011 3:00 PM
  • Same issue as well.

     

    • SCVMM 2008 R2
    • 2-node cluster, Dell R710s
    • Broadcom NetXtreme II NICs for iSCSI connectivity
    • NetApp 2040 iSCSI SAN (two controllers, active/active)

    I have an open case with Microsoft but right now they have no answers or solutions.


    Wednesday, February 09, 2011 7:25 PM
  • Ditto

    4-node cluster (Dell R610s) accessing an MD3000i SAN. It's a 2008 R2 failover cluster. SCVMM 2008 R2 is running on a separate physical server.

    There don't appear to be any performance issues, but a System event log on each host filled with Event ID 113s is a pain.

    Thursday, February 10, 2011 12:55 PM
  • Hi... I solved my issue! Tomorrow I will share my info with all of you.

    My solution involves Broadcom NetXtreme II NICs. I don't know if it will be valid for other manufacturers.

     

    Bye

     

    Thursday, February 10, 2011 6:03 PM
  • Well?  We'd all like to hear what you found!
    Monday, February 14, 2011 7:58 PM
  • Hi... I solved my issue! Tomorrow I will share my info with all of you.

    My solution involves Broadcom NetXtreme II NICs. I don't know if it will be valid for other manufacturers.

     

    Bye

     

    BUMP, I am having this issue too! -Bryan
    Tuesday, February 15, 2011 5:11 PM
  • Hi. I'm having the same issue and am using Broadcom NICs for iSCSI. I would appreciate it if you could post your fix for this issue.

    Thanks.

    Sunday, February 20, 2011 11:26 PM
  • Same here, your fix would be much appreciated
    Monday, February 21, 2011 5:26 PM
  • I've got exactly the same problem.

    • Hyper-V R2
    • 3 Node Cluster
    • SCVMM R2
    • NetApp iSCSI SAN

    It would be great if you could tell us how you solved this problem!

     

    Thanks in advance!

    Wednesday, February 23, 2011 12:25 PM
  • We were having the same issue with the Broadcom 57711 10GB NICs. It was solved by applying the latest NIC drivers and firmware (link below) and the latest VMM update that was released a few weeks ago. This issue occurs when using the NIC as an HBA on iSCSI networks. VMM runs a health check or update every 30 minutes, and this check involves a ping. When a NIC is used as an HBA, the operating system will not pick it up as a NIC, so no ping is possible on the iSCSI network. Hope this helps.
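
    Not from the original reply, but a quick way to see the symptom described above is to ping the target portal while forcing the host's iSCSI-facing address as the source; both addresses below are placeholders. If the port is running purely as an iSCSI HBA there is no OS-level IP interface to source the ping from, which matches the behaviour described.

    REM 192.168.130.10 = this host's iSCSI NIC address (placeholder),
    REM 192.168.130.101 = the iSCSI target portal (as in the event text above).
    REM -S forces the source address, so the result reflects the iSCSI network
    REM rather than the management network.
    ping -S 192.168.130.10 -n 2 192.168.130.101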

    Windows 2003/2008 x64 driver update for NetXtreme I/II Broadcom Ethernet adapters.

    Release Date: 3/10/2011

    Version: 16.2.0, A01

    http://support.dell.com/support/downloads/format.aspx?releaseid=R297264

    Virtual Machine Manager Update KB2492980 was picked up by visiting Microsoft Update.

    Wednesday, March 23, 2011 12:02 PM
  • I was getting these alerts once every 30 minutes.

    I was able to make the problem go away if I disabled the 'Virtual Machine Manager Agent'.

    SCVMM 2008 R2 SP1 is now available for download and I will be applying it to see if it resolves the issue.  I am not applying it immediately as I am hosting 2 Hyper-V clusters (DEV/QA with Windows 2008 R2 SP1 and PROD without Windows 2008 R2 SP1) and want to ensure that SCVMM SP1 can support both environments concurrently without any side effects.

     

    Thursday, March 31, 2011 4:11 PM
  • I was also receiving these messages every 30 minutes, and the Virtual Machine Manager service was the culprit: disabling the service stopped the errors.  SCVMM SP1 did not resolve the issue.

    As it turns out, the Dell-supplied drivers for the Broadcom NICs in my four servers were to blame.  When you install these drivers they assume, incorrectly, that you want iSCSI support and install an iSCSI driver.  You can see these in Device Manager under "Storage controllers" as "Broadcom BCM5709C NetXtreme II GigE iSCSI Adapter #?".

    When SCVMM polls the computer it polls all the storage controllers and fails on the ones that are not connected to your iSCSI storage.

    The iSCSI support can be disabled through the Broadcom Advanced Control Suite for any NICs not connected to your iSCSI storage device:

    1. Select the host NIC
    2. Expand Resource Reservations
    3. Uncheck the iSCSI Offload Engine
    4. Click Apply.

    Once I disabled all the unwanted iSCSI offload engines the errors stopped appearing in my system event log.  Life is good again!
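
    Not part of the original reply, but if you would rather list those per-port iSCSI adapter instances from a command prompt than dig through Device Manager, WMI's Win32_SCSIController class should show them (assuming the Broadcom driver registers them as storage controllers, which is how they appear in Device Manager):

    REM List storage controllers as the OS sees them; the
    REM "... NetXtreme II GigE iSCSI Adapter" entries are the instances
    REM that the SCVMM host refresh ends up polling.
    wmic path Win32_SCSIController get Name,Status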

    • Proposed as answer by alegr1 Friday, July 06, 2012 12:31 AM
    Thursday, May 26, 2011 1:39 PM
  • The problem is that, even though not all iSCSI HBA instances can actually reach the target in question, the user had set up the Discovery Portal to issue iSCSI SendTargets along all possible iSCSI HBAs plus the Microsoft software initiator.  This is the default behavior if all one does is specify the target portal without binding it to a specific initiator.

    To properly configure discovery, do the following (assumes a fresh environment; a rough command-line sketch follows these steps):

    1. Open the iSCSI Initiator GUI
    2. Select the Discovery Tab
    3. Click "Discovery Portal..." button to open the Discovery Target Portal dialog
    4. Enter the IP address (optionally TCP Port number) of the target's iSCSI portal
    5. Click "Advanced..." button to open the Advanced Settings dialog
    6. On the "Local Adapter:" pulldown, select a specific HBA instance you *know* can actually connect to the target.  Hint: by inspecting the list of IPs for this HBA instance (see step 7 below), you can tell which instances can reach the target
    7. On the "Initiator IP:" pulldown, select the local address from which this HBA should connect
    8. Click OK to close the Advanced Settings dialog
    9. Click OK to save your changes
    10. Repeat from (3) for all Initiator - Target combinations
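
    For anyone who prefers to script this, a rough command-line equivalent using the built-in iscsicli tool. The portal address, HBA device instance name and port index below are placeholders (take the real values from the initiator GUI or from ListTargetPortals), and the AddTargetPortal parameter order should be double-checked against iscsicli's built-in help on your build:

    REM Show the discovery portals currently configured and the initiator
    REM instance each one is bound to.
    iscsicli ListTargetPortals

    REM Re-add the portal bound to one specific HBA instance and port index
    REM instead of the default "all initiators". The address, instance name
    REM and port index are placeholders; each * leaves the remaining
    REM login/security parameters at their defaults.
    iscsicli RemoveTargetPortal 192.168.130.101 3260
    iscsicli AddTargetPortal 192.168.130.101 3260 ROOT\ISCSIPRT\0000_0 0 * * * * * * * * * *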

    Tim

    • Proposed as answer by bellacotim Friday, July 06, 2012 12:59 AM
    Friday, July 06, 2012 12:59 AM
  • To properly configure discovery, do the following (assumes a fresh environment):

    1. Open the iSCSI Initiator GUI
    2. Select the Discovery Tab
    3. Click "Discovery Portal..." button to open the Discovery Target Portal dialog
    4. Enter the IP address (optionally TCP Port number) of the target's iSCSI portal
    5. Click "Advanced..." button to open the Advanced Settings dialog
    6. On the "Local Adapter:" pulldown, select a specific HBA instance you *know* can actually connect to the target.  Hint: by inspecting the list of IPs for this HBA instance (see step 7 below), you can tell which instances can reach the target
    7. On the "Initiator IP:" pulldown, select the local address from which this HBA should connect
    8. Click OK to close the Advanced Settings dialog
    9. Click OK to save your changes
    10. Repeat from (3) for all Initiator - Target combinations

    Tim

    This worked for me in my environment: R610, 2008 R2, MD3620i, and SCVMM 2012.  I was getting the same 113 warning on a few of my boxes every 30 minutes.  I used these steps on one of my boxes and that box stopped generating the error.

    It was strange because I have a different box in the same cluster that is not presenting these same warnings.  If I manually refreshed the cluster from VMM, it would generate those events.  If I turned off the VMM server for an hour, they would stop.

    I hope you don't mind, but I used your solution in a blog post about this issue and linked it back here: iScsi SendTarget issues with MD3620i and VMM

    Thursday, August 09, 2012 2:40 PM
  • Sorry for necro'ing an old thread, but this is good stuff!!!

    For anyone that was receiving the 0xEFFF0024 error, were you using DPM for backups? Would this error affect backup performance? That's the most important question. While I'm not using Hyper-V clustering, I have 3 Hyper-V hosts. Two of them have really poor performance when backing up Hyper-V VMs that reside on our EQL 6100 SAN. These machines both have Broadcom 5709 quad-port NICs installed. They have the latest drivers from Broadcom installed, but the initially installed drivers came from Dell. I really, really want to try these solutions, but since the two problem machines are in a production environment, I have to be cautious.

    Thursday, August 09, 2012 8:02 PM
  • Sorry for necro'ing an old thread, but this is good stuff!!!

    For anyone that was receiving the 0xEFFF0024 error, were you using DPM for backups? Would this error affect backup performance? That's the most important question. While I'm not using Hyper-V clustering, I have 3 Hyper-V hosts. Two of them have really poor performance when backing up Hyper-V VMs that reside on our EQL 6100 SAN. These machines both have Broadcom 5709 quad-port NICs installed. They have the latest drivers from Broadcom installed, but the initially installed drivers came from Dell. I really, really want to try these solutions, but since the two problem machines are in a production environment, I have to be cautious.

    The 0xEFFF0024 error can be a symptom of performance issues, but I don't think the error indicates performance issues when it appears by itself. I have seen my hosts throw that error when my guests are having iSCSI issues.  I have one file server that is virtual, but its data volume is an iSCSI target.  When my backup creates a Volume Shadow Copy (VSS) snapshot, the iSCSI disk can sometimes stall for 60 seconds and drop a lot of MSiSCSI 9 errors in the event log.  I could reproduce the issue by manually taking VSS snapshots.

    I reduced the impact by lowering the disk timeout from 60 seconds to 30 so the system would recover sooner. I also set my MPIO policy to Least Queue Depth (LQD) instead of Round Robin (RR); RR is what Dell's documentation suggested for my MD3620i.  After that, my manual VSS snapshots stopped stalling the system.  My backup software still causes the issue, but backups do better when the system is not under load, so for now I am back to nightly backups while I work it out.

    I mention VSS because I know DPM (and most other backup software) depends on it, and it gives you another area to troubleshoot.
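
    For anyone who wants to try the same mitigations, a rough sketch of the equivalent commands: the disk timeout is the TimeOutValue registry entry, and the MPIO load-balance policy can be inspected and changed with the built-in mpclaim tool. The disk number and the policy value 4 (Least Queue Depth) below are examples; verify them against mpclaim's own help before applying.

    REM Lower the disk I/O timeout from 60 seconds (the value mentioned above)
    REM to 30; a reboot is usually recommended for the change to take effect.
    reg add HKLM\SYSTEM\CurrentControlSet\Services\Disk /v TimeOutValue /t REG_DWORD /d 30 /f

    REM Show the MPIO disks and their current load-balance policies...
    mpclaim -s -d

    REM ...and switch one of them (disk 0 here, as an example) to Least Queue Depth.
    mpclaim -l -d 0 4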

    Tuesday, August 14, 2012 5:07 AM