Server 2016 S2D - VirtualDisk "detached" - Cannot get connected by all means although pool resources are all healthy...

    Question

  • Hi,

    I am currently having a serious problem on a 4-node S2D cluster. After rebooting the nodes one by one for patching, one of the five volumes (4 production volumes + 1 for VMFleet) failed to come online again and is now displayed as "Detached" with HealthStatus "Unknown". When I try to bring it online or connect it via a PowerShell command, the operation fails with "Access is denied". Further info as follows:

    Get-VirtualDisk | Where-Object -Filter { $_.OperationalStatus -eq "Detached" }
    FriendlyName ResiliencySettingName OperationalStatus HealthStatus IsManualAttach  Size
    ------------ --------------------- ----------------- ------------ --------------  ----
    TOM-N2       Mirror                Detached          Unknown      True           25 TB


    Get-StorageSubSystem clu* | Debug-StorageSubSystem
    Severity: Critical

    Reason         : The volume 'TOM-N2' is inaccessible. Some data may be lost.
    Recommendation : Check the physical and/or network connectivity of all storage devices. You may need to restore from backup.
    Location       : Not available
    Description    : Virtual disk 'TOM-N2'


    Get-VirtualDisk | Where-Object -Filter { $_.OperationalStatus -eq "Detached" } | Connect-VirtualDisk
    Connect-VirtualDisk : Access denied

    Extended information:
    Access is denied.

    Recommended Actions:
    - Check if you have the necessary privileges to perform the operation.
    - Perform the operation from Failover Cluster Manager if the resource is clustered.

    Activity ID: {94a81a09-e8c6-40ca-886f-66cbeb2db536}
    At line:1 char:82
    + ... -Filter { $_.OperationalStatus -eq "Detached" } | Connect-VirtualDisk
    +                                                       ~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : PermissionDenied: (StorageWMI:ROOT/Microsoft/...SFT_VirtualDisk) [Connect-VirtualDisk], CimException
        + FullyQualifiedErrorId : StorageWMI 40001,Connect-VirtualDisk

    VirtualDisk properties as follows:

    Get-VirtualDisk | Where-Object -Filter { $_.OperationalStatus -eq "Detached" } |fl

    ObjectId                          : {1}\\TOM\root/Microsoft/Windows/Storage/Providers_v2\SPACES_VirtualDisk.ObjectId="{
                                        815d0f59-a7b9-4dc9-8db1-52c49b96a20f}:VD:{c1b20148-606d-4ffe-9bf3-cade9bb87d76}{12a
                                        6f113-05a5-42f7-85c7-7992ea7b6061}"
    PassThroughClass                  : 
    PassThroughIds                    : 
    PassThroughNamespace              : 
    PassThroughServer                 : 
    UniqueId                          : 13F1A612A505F74285C77992EA7B6061
    Access                            : Read/Write
    AllocatedSize                     : 27487790694400
    AllocationUnitSize                : 1073741824
    ColumnIsolation                   : PhysicalDisk
    DetachedReason                    : By Policy
    FaultDomainAwareness              : StorageScaleUnit
    FootprintOnPool                   : 54989003161600
    FriendlyName                      : TOM-N2
    HealthStatus                      : Unknown
    Interleave                        : 262144
    IsDeduplicationEnabled            : False
    IsEnclosureAware                  : False
    IsManualAttach                    : True
    IsSnapshot                        : False
    IsTiered                          : False
    LogicalSectorSize                 : 4096
    MediaType                         : Unspecified
    Name                              : 
    NameFormat                        : 
    NumberOfAvailableCopies           : 
    NumberOfColumns                   : 8
    NumberOfDataCopies                : 2
    NumberOfGroups                    : 1
    OperationalStatus                 : Detached
    OtherOperationalStatusDescription : 
    OtherUsageDescription             : 
    ParityLayout                      : 
    PhysicalDiskRedundancy            : 1
    PhysicalSectorSize                : 4096
    ProvisioningType                  : Fixed
    ReadCacheSize                     : 0
    RequestNoSinglePointOfFailure     : False
    ResiliencySettingName             : Mirror
    Size                              : 27487790694400
    UniqueIdFormat                    : Vendor Specific
    UniqueIdFormatDescription         : 
    Usage                             : Other
    WriteCacheSize                    : 0
    PSComputerName                    : 

    Does anybody have a hint as to what could help remedy the situation? As already mentioned, all storage devices and the node connectivity are fine.

    Thanks in advance,

    NG

    


    • Edited by llangolas Monday, March 19, 2018 5:52 PM
    Monday, March 19, 2018 4:14 PM

All replies

  • Hi NG,

    Can you confirm that the storage pool is not set to read-only mode, and that you are running PowerShell "as administrator"? Please start PowerShell via right-click > "Run as administrator", even if you are already logged on to the systems with an admin account.

    In our experience, if there is something wrong with the physical disks attached to the server, the virtual disk can fail to connect.

    If you run Get-PhysicalDisk, do all the disks show a HealthStatus of Healthy?

    Please also check the Event Viewer. Are there any further warnings or errors?
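
    The checks above could be run roughly as follows. This is only a sketch: it assumes an elevated PowerShell session on one of the cluster nodes, and the choice of the Storage Spaces driver event log is my assumption about where relevant warnings are likely to show up, not something confirmed for this case.

    ```powershell
    # Run from an elevated PowerShell session on one of the cluster nodes.

    # 1. Is the S2D (non-primordial) pool writable and healthy?
    Get-StoragePool -IsPrimordial $false |
        Select-Object FriendlyName, IsReadOnly, HealthStatus, OperationalStatus

    # 2. List any physical disk that does not report as healthy.
    Get-PhysicalDisk |
        Where-Object { $_.HealthStatus -ne 'Healthy' } |
        Select-Object FriendlyName, SerialNumber, HealthStatus, OperationalStatus, Usage

    # 3. Show any storage jobs (repair, rebalance) known to the subsystem.
    Get-StorageJob

    # 4. Recent warnings and errors from the Storage Spaces driver log
    #    (assumed to be the most relevant log here).
    Get-WinEvent -LogName 'Microsoft-Windows-StorageSpaces-Driver/Operational' -MaxEvents 200 |
        Where-Object { $_.LevelDisplayName -in 'Warning', 'Error' } |
        Select-Object TimeCreated, Id, LevelDisplayName, Message
    ```

    If step 2 returns nothing and the pool shows IsReadOnly as False, the disks and pool are probably not the cause.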

    If all physical disks report as healthy, then as a workaround you may have to delete this virtual disk and recover the data from backup.

    https://docs.microsoft.com/en-us/windows-server/failover-clustering/health-service-faults#faulttype-microsofthealthfaulttypevirtualdisksdetached

    Best Regards,

    Mary


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.

    Tuesday, March 20, 2018 6:17 AM
    Moderator
  • Hi Mary,

    Yes, I can confirm that I am running PowerShell as Administrator and that the pool is definitely not set to read-only. Furthermore, all disks are healthy, as already mentioned in my original post. Of course I know that I could trash the 25 TB volume and recreate it. However, not all machines stored on the volume are backed up, since it is used by QA for their testing environments, among others. Needless to say, we also need to build trust in this solution, as we want to recommend it to our customers as well. Having to trash and rebuild volumes from time to time (without knowing why) is therefore not an option, even though the cited fault of the built-in health service, "FaultType: Microsoft.Health.FaultType.VirtualDisks.Detached", seems to primarily suggest this.

    Regards,

    NG

    Tuesday, March 20, 2018 8:20 AM
  • Hi,

    In your scenario, according to your description, the physical disks and network connections all look fine, and the only error points to this virtual disk. I'm afraid you might need to contact Microsoft Customer Support Services (CSS) so that a dedicated Support Professional can help you with this issue. Further troubleshooting may be beyond the scope of forum support.

    Appreciate your support and understanding.

    To obtain the phone numbers for specific technology request, please refer to the website listed below:

    https://www.microsoft.com/en-us/worldwide.aspx

    Best Regards,

    Mary



    Wednesday, March 21, 2018 2:04 AM
    Moderator
  • I'm curious if you were ever able to resolve your problem? I'm having the same problem as of this morning and would appreciate any help.

    Thanks,

    Josh

    Thursday, June 07, 2018 1:37 PM
  • Hi Josh, did you find a fix for the issue?
    Wednesday, June 13, 2018 3:59 AM
  • We had a similar issue: 4 S2D nodes in a cluster (running three-way mirror).

    After running Windows Update on the first node and rebooting, the virtual disk went offline. The node that was rebooted had no VPS running on it and wasn't the virtual disk owner.

    This helped us fix the issue. It meant extended downtime, but things came back online.

    https://jtpedersen.com/2016/11/troubleshooting-failed-virtualdisk-on-a-storage-spaces-direct-cluster/
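
    For anyone who cannot reach the link: the following is a rough sketch of the recovery sequence that Microsoft's published S2D troubleshooting guidance describes for detached virtual disks, as far as I understand it. It is not necessarily identical to the steps in the linked post. The resource name is an assumption based on the usual "Cluster Virtual Disk (<name>)" naming, so check the real name with Get-ClusterSharedVolume first, and treat the whole thing as something to try only with a backup or a support case in hand.

    ```powershell
    # Assumed resource name; verify it with Get-ClusterSharedVolume / Get-ClusterResource.
    $resource = 'Cluster Virtual Disk (TOM-N2)'

    # 1. Take the disk out of Cluster Shared Volumes (it moves to Available Storage).
    Remove-ClusterSharedVolume -Name $resource

    # 2. Set the disk resource to recovery mode, then start it.
    #    Per Microsoft's guidance, this lets the space attach without the usual checks.
    Get-ClusterResource -Name $resource |
        Set-ClusterParameter -Name DiskRecoveryAction -Value 1
    Start-ClusterResource -Name $resource

    # 3. On the node that owns the disk, start the integrity scan and watch the jobs.
    Get-ScheduledTask -TaskName 'Data Integrity Scan for Crash Recovery' |
        Start-ScheduledTask
    Get-StorageJob

    # 4. Once the jobs have finished and the disk reports healthy,
    #    revert the setting, restart the resource and re-add it to CSV.
    Get-ClusterResource -Name $resource |
        Set-ClusterParameter -Name DiskRecoveryAction -Value 0
    Stop-ClusterResource -Name $resource
    Start-ClusterResource -Name $resource
    Add-ClusterSharedVolume -Name $resource
    ```

    The scan in step 3 can run for a long time on a 25 TB volume, which would match the extended downtime.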

    It seems there is an issue with rebooting nodes in an S2D cluster. MS needs a more foolproof way to take nodes that are part of an S2D cluster offline, because this issue is a major nightmare!
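
    For reference, the per-node maintenance sequence that Microsoft documents for S2D looks roughly like the sketch below. The node name is hypothetical, and the wait loop at the end is my own simplification of "wait until the storage jobs are done"; it is not an official script and would not by itself have prevented the problem described in this thread.

    ```powershell
    # Hypothetical node name; replace with the node being patched.
    $node = 'Node1'

    # 1. Before touching the node, confirm every virtual disk is healthy.
    Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus

    # 2. Pause the node and drain its roles to the other nodes.
    Suspend-ClusterNode -Name $node -Drain -Wait

    # 3. Patch and reboot the node here, then bring it back into the cluster.
    Resume-ClusterNode -Name $node -Failback Immediate

    # 4. Wait for resync jobs to finish before moving on to the next node.
    while (Get-StorageJob | Where-Object { $_.JobState -ne 'Completed' }) {
        Start-Sleep -Seconds 30
    }
    Get-VirtualDisk | Select-Object FriendlyName, HealthStatus, OperationalStatus
    ```

    The important part is step 4: a second node should not be rebooted while the disks are still resyncing from the first one.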

    Wednesday, June 13, 2018 5:01 AM
  • We opened a case with Microsoft. The solution was to remove all of the disks in that pool from Cluster Shared Volumes. They then appeared as Available Storage in Failover Cluster Manager. We then deleted all of them from Available Storage. Then, in Server Manager, we had to remove all of the virtual disks that belonged to that pool, select "add virtual disk", and add the disks back into the pool. This allowed us to re-attach the disk, but it still showed errors. We were able to copy the contents of the disk to a new virtual disk and add that new virtual disk back to the Cluster Shared Volumes.

    We tried adding the original disk back to Cluster Shared Volumes, but this resulted in the disk being detached again, and we were unable to re-attach it without going through the same process again. So we are currently running fine, but we plan on adding additional storage to the cluster so that we can move everything to the new storage and rebuild the old storage from scratch.

    Thanks,

    Josh

    Wednesday, June 13, 2018 3:00 PM