Storage Pool: Transient Error

  • Question

  • Hello Guys,

    I'm in big trouble with my Storage Spaces storage pool. I'm running a tiered setup with four SSDs and twelve HDDs. The server ran for about two weeks without issues. Today I restarted the server, and when it came back up my whole SSD tier showed a Transient Error. I can't do anything because my storage pool's OperationalStatus is Read-only, which means I cannot remove or retire the disks, nor can I run Optimize-StoragePool or anything else. All I did was a normal reboot, as I had done several times before, and everything seems to be destroyed. I've also tried reseating the disks and a couple of reboots, but nothing seems to help.

    I'm running Windows Server 2019. See the configuration below:

    [File01]: PS C:\Users\Administrator\Documents> get-disk
    
    Number Friendly Name                                                                      Serial Number                    HealthStatus         OperationalStatus      Total Size Partition
                                                                                                                                                                                      Style
    ------ -------------                                                                      -------------                    ------------         -----------------      ---------- ----------
    1      ATA Samsung SSD 860                                                                S4CZNF0M458423B                  Healthy              Online                  931.51 GB RAW
    5      ATA Samsung SSD 860                                                                S4CZNF0M465197M                  Healthy              Online                  931.51 GB RAW
    9      ATA Samsung SSD 860                                                                S4CZNF0M465155K                  Healthy              Online                  931.51 GB RAW
    13     ATA Samsung SSD 860                                                                S4CZNF0M458560H                  Healthy              Online                  931.51 GB RAW
    0      Samsung SSD 840 EVO 120GB                                                          S1D5NSAF203917B                  Healthy              Online                  111.79 GB GPT
    
    
    [File01]: PS C:\Users\Administrator\Documents> get-physicaldisk
    
    DeviceId FriendlyName              SerialNumber    MediaType CanPool OperationalStatus     HealthStatus Usage            Size
    -------- ------------              ------------    --------- ------- -----------------     ------------ -----            ----
    11       ATA TOSHIBA DT01ACA3      535T1SYYS       HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    8        ATA WDC WD100EMAZ-00      JEK2JYYZ        HDD       False   OK                    Healthy      Auto-Select    9.1 TB
    5        ATA Samsung SSD 860       S4CZNF0M465197M SSD       False   {Transient Error, OK} Unhealthy    Auto-Select 931.51 GB
    4        ATA WDC WD100EMAZ-00      JEHLDU2X        HDD       False   OK                    Healthy      Auto-Select    9.1 TB
    10       ATA TOSHIBA DT01ACA3      18F94WVAS       HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    14       ATA ST3000DM001-1CH1      Z1F4QWY4        HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    2        ATA ST3000DM001-1CH1      Z1F2Y816        HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    0        Samsung SSD 840 EVO 120GB S1D5NSAF203917B SSD       False   OK                    Healthy      Auto-Select 111.79 GB
    7        ATA TOSHIBA DT01ACA3      X384UVNGS       HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    1        ATA Samsung SSD 860       S4CZNF0M458423B SSD       False   {Transient Error, OK} Unhealthy    Auto-Select 931.51 GB
    6        ATA TOSHIBA DT01ACA3      18F93URAS       HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    16       ATA WDC WD100EMAZ-00      JEHR7YZZ        HDD       False   OK                    Healthy      Auto-Select    9.1 TB
    13       ATA Samsung SSD 860       S4CZNF0M458560H SSD       False   {Transient Error, OK} Unhealthy    Auto-Select 931.51 GB
    12       ATA WDC WD100EMAZ-00      JEK15YAZ        HDD       False   OK                    Healthy      Auto-Select    9.1 TB
    3        ATA Hitachi HDS5C303      MCM9215Q08RSLX  HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    15       ATA ST3000DM001-1CH1      Z1F4RNEX        HDD       False   OK                    Healthy      Auto-Select   2.73 TB
    9        ATA Samsung SSD 860       S4CZNF0M465155K SSD       False   {Transient Error, OK} Unhealthy    Auto-Select 931.51 GB
    
    
    [File01]: PS C:\Users\Administrator\Documents> get-storageenclosure
    
    FriendlyName SerialNumber OperationalStatus HealthStatus NumberOfSlots ElementTypesInError
    ------------ ------------ ----------------- ------------ ------------- -------------------
    LSI SAS2X36               OK                Healthy      24
    
    
    [File01]: PS C:\Users\Administrator\Documents> get-virtualdisk
    
    FriendlyName ResiliencySettingName FaultDomainRedundancy OperationalStatus HealthStatus     Size FootprintOnPool StorageEfficiency
    ------------ --------------------- --------------------- ----------------- ------------     ---- --------------- -----------------
    vDisk01                                                  Detached          Unknown      45.46 TB        61.83 TB           73,53 %
    
    
    [File01]: PS C:\Users\Administrator\Documents> get-storagepool
    
    FriendlyName OperationalStatus HealthStatus IsPrimordial IsReadOnly     Size AllocatedSize
    ------------ ----------------- ------------ ------------ ----------     ---- -------------
    Primordial   OK                Healthy      True         False      61.96 TB      58.21 TB
    Pool01       Read-only         Unhealthy    False        False      61.85 TB      61.83 TB
    
    
    [File01]: PS C:\Users\Administrator\Documents> Get-StorageSubSystem
    
    FriendlyName              HealthStatus OperationalStatus
    ------------              ------------ -----------------
    Windows Storage on File01 Healthy      OK
    
    
    [File01]: PS C:\Users\Administrator\Documents> get-storagejob
    [File01]: PS C:\Users\Administrator\Documents> get-storagetier
    
    FriendlyName      TierClass   MediaType ResiliencySettingName FaultDomainRedundancy     Size FootprintOnPool StorageEfficiency
    ------------      ---------   --------- --------------------- ---------------------     ---- --------------- -----------------
    vDisk01-HDDTier01 Capacity    HDD       Parity                1                     43.65 TB         58.2 TB           75,00 %
    vDisk01-SSDTier01 Performance SSD       Mirror                1                      1.81 TB         3.62 TB           50,00 %
    SSDTier01         Unknown     SSD       Mirror                1                         0 B             0 B
    HDDTier01         Unknown     HDD       Parity                1                         0 B             0 B
    
    

    It's really weird that the disks show up in Get-Disk. It seems like the storage pool isn't looking for the correct disks. I don't understand how this can happen from a simple reboot. There were no hardware changes at all.

    Any help would be appreciated.

    Monday, September 2, 2019 12:35 PM

All replies

  • Hi,

    Thanks for your question.

    Transient error means there was a temporary error with the drive. This usually means the drive was unresponsive, but it could also mean that the Storage Spaces protective partition was inappropriately removed from the drive.

    Please try the following steps to see if they work.

    1. If the drive doesn't transition back to the OK state, you can try using the Reset-PhysicalDisk cmdlet to wipe the drive.

    2. Use Repair-VirtualDisk to restore the resiliency of affected virtual disks.

    3. If this keeps happening, replace the drive, or try getting detailed diagnostic info about this drive by following the steps in Troubleshooting using Windows Error Reporting > Physical disk failed to come online.
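
    The steps above can be sketched in PowerShell roughly as follows. This is a cautious sketch, not a verified fix: the names Pool01 and vDisk01 are taken from the output in the question, and the destructive commands are left commented out on purpose. Reset-PhysicalDisk wipes the drive, so it is only reasonable when the virtual disk still has enough healthy copies of the data to repair from. If every disk of one media type is listed as unhealthy, resetting them all would discard that whole tier.

    ```powershell
    # Sketch only: inspect first, change nothing.
    # 'Pool01' and 'vDisk01' are the names shown in the question's output.
    $pool = Get-StoragePool -FriendlyName 'Pool01'

    # Collect the pool's disks that are not reported as healthy.
    $bad = $pool | Get-PhysicalDisk | Where-Object HealthStatus -ne 'Healthy'

    # Show them sorted by media type, so it is obvious whether an entire
    # tier (for example, every SSD) is affected.
    $bad | Sort-Object MediaType |
        Format-Table DeviceId, FriendlyName, SerialNumber, MediaType,
                     OperationalStatus, HealthStatus

    # Step 1 (DESTRUCTIVE, wipes the disk). Uncomment only when the remaining
    # healthy disks still hold enough data to rebuild from:
    # foreach ($disk in $bad) { Reset-PhysicalDisk -UniqueId $disk.UniqueId }

    # Step 2: restore resiliency, then watch the repair job:
    # Repair-VirtualDisk -FriendlyName 'vDisk01'
    # Get-StorageJob
    ```

    Keeping the reset and repair lines commented out makes the script safe to run as a pure diagnostic first.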

    Detailed reference link:

    https://docs.microsoft.com/en-us/windows-server/storage/storage-spaces/storage-spaces-states

    I hope the above information helps.

    I highly appreciate your effort and time. If you have any questions or concerns, please feel free to let me know.

    Best regards,

    Michael


    Please remember to mark the replies as answers if they help.
    If you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com

    Tuesday, September 3, 2019 8:12 AM
    Moderator
  • Hi,

    Just checking in to see if the information provided was helpful. Please let us know if you would like further assistance.

    Best Regards,

    Michael



    Friday, September 6, 2019 9:33 AM
    Moderator
  • Hi,

    How are things going? Was your issue resolved?

    Please feel free to let me know if you need further assistance.

    Best regards,

    Michael




    Wednesday, September 18, 2019 8:50 AM
    Moderator
  • Sadly, your answer did not help me. Because all disks of a tier were down, running Reset-PhysicalDisk would have just wiped that whole tier. I couldn't run Repair-VirtualDisk because, with the resiliency missing, the virtual disk wouldn't come online anymore. I ended up removing the virtual disk and creating a new parity virtual disk without tiered storage. I suspect that ReFS caused this issue, as many other users have the exact same issue: ReFS-formatted disks just come up as RAW after a reboot.

    With the new single-tier parity virtual disk I now have a different issue, the same as described here:

    https://social.technet.microsoft.com/Forums/en-US/bc6a89bb-bd7f-4230-b4f2-af8a1ddb0697/volume-goes-offline-error-storagespacesdrivereventid-312?forum=winserverfiles

    As soon as there are too many concurrent writes to that disk, it is pulled offline. The whole server becomes unresponsive when that happens. I can't really find any solution to this on the web.
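
    For anyone hitting the same thing, a minimal sketch for pulling the relevant events out of the event log. The event ID 312 comes from the title of the linked thread. The log name used here is an assumption and may differ on your system, so list the available Storage Spaces logs first if the query returns nothing.

    ```powershell
    # Sketch only: show recent Storage Spaces driver events with ID 312.
    # The log name is an assumption; verify which logs exist with:
    #   Get-WinEvent -ListLog '*StorageSpaces*'
    $filter = @{
        LogName = 'Microsoft-Windows-StorageSpaces-Driver/Operational'
        Id      = 312
    }

    # Newest events come first; 20 is an arbitrary limit for readability.
    Get-WinEvent -FilterHashtable $filter -MaxEvents 20 |
        Select-Object TimeCreated, Id, LevelDisplayName, Message |
        Format-List
    ```

    Comparing the TimeCreated values with the times of heavy write load should show whether the offline events actually line up with the concurrent writes.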

    Friday, September 20, 2019 7:55 AM
  • Hi,

    Sorry for the delay.

    For now, I still haven't found any clue. I'm afraid you might need to contact Microsoft Customer Support Services (CSS) so that a dedicated Support Professional can help you with this issue.

    To obtain the phone number for a specific technology request, please refer to the website listed below:

    https://www.microsoft.com/en-us/worldwide.aspx

    I appreciate your support and understanding.

    Best regards,

    Michael



    Thursday, September 26, 2019 10:04 AM
    Moderator