Troubleshoot Storage Spaces parity pool "offline due to critical write failures"

  • Question

  • NOTE: The following is a repost from the home user forum, as suggested by a Microsoft employee (https://answers.microsoft.com/en-us/windows/forum/windows_10-files/troubleshoot-storage-spaces-parity-pool-offline/d78a2b02-7230-4d16-8d43-a213320aafcc). I am posting it here in the hope that the community can suggest a solution that will also help others who have reported the same issue. I never found a solution myself and ended up deleting the storage space and starting over from a backup. 

    My parity storage space of 4 drives went offline. I'm having difficulty troubleshooting the issue and would appreciate any help.

    - The Manage Storage Spaces GUI just says: "Offline due to critical write failures; add drives". All drives show "OK" in the GUI.
    - Get-PhysicalDisk shows all drives as "Healthy" with an OperationalStatus of "OK".
    - HDD manufacturer long tests (12+ hours) on each drive succeed without any errors.
    - If I click the "Bring online" link in the GUI, it hangs for minutes and then fails silently.
    - The only thing I have found in Event Viewer is a single message about the pool, logged each time I try to bring it online:

    Source: StorageSpaces-Driver
    Event ID: 312
    Virtual disk {353523d0-ee91-11e2-be7c-bc5ff4b0cdb7} has failed a write operation to all its copies.                  
                      
    You can online the disk by using the following commands in PowerShell:                  
                      
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{353523d0-ee91-11e2-be7c-bc5ff4b0cdb7}" } | Get-Disk | Set-Disk -IsReadOnly $false 
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{353523d0-ee91-11e2-be7c-bc5ff4b0cdb7}" } | Get-Disk | Set-Disk -IsOffline  $false

    - When I run the above commands, the storage space comes online successfully, but in read-only mode. After the next reboot it is back in the same offline state.

    After running the above Get-VirtualDisk commands, I ran chkdsk without any parameters, and it reported a problem (see "The Volume Bitmap is incorrect" in the output below):
    PS C:\WINDOWS\system32> chkdsk d:
    The type of the file system is NTFS.
    The volume is in use by another process. Chkdsk
    might report errors when no corruption is present.
    Volume label is ParityDrive.
    WARNING!  /F parameter not specified.
    Running CHKDSK in read-only mode.
    Stage 1: Examining basic file system structure ...
      2348032 file records processed.
    File verification completed.
      191 large file records processed.
      0 bad file records processed.
    Stage 2: Examining file name linkage ...
      2941 reparse records processed.
      2720812 index entries processed.
    Index verification completed.
      0 unindexed files scanned.
      0 unindexed files recovered to lost and found.
      2941 reparse records processed.
    Stage 3: Examining security descriptors ...
    Security descriptor verification completed.
      186391 data files processed.
    CHKDSK is verifying Usn Journal...
    Usn Journal verification completed.
    The Volume Bitmap is incorrect.
    Windows has checked the file system and found problems.
    Please run chkdsk /scan to find the problems and queue them for repair.
      10485630 MB total disk space.
       5763744 MB in 2128469 files.
        437248 KB in 186392 indexes.
             0 KB in bad sectors.
       9825492 KB in use by the system.
         65536 KB occupied by the log file.
       4711864 MB available on disk.
          4096 bytes in each allocation unit.
    2684321530 total allocation units on disk.
    1206237262 allocation units available on disk.
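
    As a quick sanity check, the size figures in the chkdsk summary above agree with each other: multiplying the allocation-unit counts by the 4096-byte unit size reproduces the reported totals. The sketch below assumes that chkdsk's "MB" means 2**20 bytes and that it rounds down; it says nothing about the bitmap error itself, only that the summary is internally consistent.

```python
# Figures copied from the chkdsk summary above.
BYTES_PER_UNIT = 4096
TOTAL_UNITS = 2684321530
FREE_UNITS = 1206237262

# Assumption: chkdsk's "MB" is 2**20 bytes and values are rounded down.
MIB = 1024 * 1024


def units_to_mb(units: int, bytes_per_unit: int = BYTES_PER_UNIT) -> int:
    """Convert a count of allocation units to whole megabytes (rounded down)."""
    return units * bytes_per_unit // MIB


total_mb = units_to_mb(TOTAL_UNITS)
free_mb = units_to_mb(FREE_UNITS)
print(f"total: {total_mb} MB, available: {free_mb} MB")
```

    Both values match the "total disk space" and "available on disk" lines in the output.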

    Then I tried the following:

    PS C:\WINDOWS\system32> chkdsk /scan d:
    The type of the file system is NTFS.
    Shadow copying the specified volume is not supported.
    A snapshot error occured while scanning this drive. Run an offline scan and fix.

    PS C:\WINDOWS\system32> chkdsk /offlinescanandfix d:
    The type of the file system is NTFS.
    Windows cannot run disk checking on this volume because it is write protected.

    PS C:\WINDOWS\system32> chkdsk /scan /forceofflinefix d:
    The type of the file system is NTFS.
    Shadow copying the specified volume is not supported.
    A snapshot error occured while scanning this drive. Run an offline scan and fix.


    So now it is a chicken-and-egg scenario: chkdsk needs the volume to be writable, but the volume will only mount read-only until the errors are fixed with chkdsk.

    I was never able to get anything to work. I tried various other commands without luck, such as (but not limited to) chkntfs, Repair-VirtualDisk, & Repair-Volume in many different variations and steps.

    Saturday, December 16, 2017 7:32 PM

All replies

  • Can I piggyback on Josh's post? I've got the same problem on my PC, where I have 8 × 8 TB disks in a Storage Space that is reporting the same error code. So far I've not had any luck getting access to the Storage Space using Josh's method of running the following commands in PowerShell:

    Get-VirtualDisk | ?{ $_.ObjectId -Match "{353523d0-ee91-11e2-be7c-bc5ff4b0cdb7}" } | Get-Disk | Set-Disk -IsReadOnly $false
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{353523d0-ee91-11e2-be7c-bc5ff4b0cdb7}" }  | Get-Disk | Set-Disk -IsOffline  $false

    I set this up as a 'reliable' place to put my backups and a large number of archive files totalling over 20 TB, expecting that I could swap out a failing disk as and when needed. Have I used the wrong technology? I do need to be able to access this Storage Space!

    Richard

    Saturday, December 16, 2017 11:05 PM
  • Hi Richard,

    If we are having this much trouble with Storage Spaces, I suspect it may not be meant for home users, or certainly isn't ready for them in the Pro version of Windows. I'd hope the server versions are more robust and have more tools to diagnose/fix. I am comfortable troubleshooting these types of things and I was not able to easily resolve the issue. There may be something very easy I missed (I hope so!), but I certainly was not able to resolve it with the GUI or lots of searches.

    The ObjectId will be unique for each VirtualDisk so you won't be able to use the same one from my system. This is where I found the log in the Event Viewer:

    Applications and Services Logs | Microsoft | Windows | StorageSpaces-Driver | Operational

    I'd poke around all the Storage*, StorDiag, Disk*, & ntfs logs for errors.
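
    Since the ObjectId differs per machine, the GUID has to come from your own Event 312 text. A minimal sketch of that substitution in Python follows; the helper name `online_commands` is made up for illustration, and the command template is simply the one quoted in the event message earlier in this thread. It only builds the command strings and does not run anything.

```python
import re
from typing import List

# Matches a brace-wrapped GUID such as {353523d0-ee91-11e2-be7c-bc5ff4b0cdb7}.
GUID_RE = re.compile(r"\{[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\}")


def online_commands(event_message: str) -> List[str]:
    """Build the two Set-Disk pipelines for the GUID found in an Event 312 message.

    Hypothetical helper: it copies the command template from the event text
    and fills in whatever GUID the message contains.
    """
    match = GUID_RE.search(event_message)
    if match is None:
        raise ValueError("no virtual disk GUID found in the event text")
    guid = match.group(0)  # includes the surrounding braces
    prefix = f'Get-VirtualDisk | ?{{ $_.ObjectId -Match "{guid}" }} | Get-Disk | Set-Disk'
    return [f"{prefix} -IsReadOnly $false", f"{prefix} -IsOffline $false"]


message = (
    "Virtual disk {353523d0-ee91-11e2-be7c-bc5ff4b0cdb7} "
    "has failed a write operation to all its copies."
)
for command in online_commands(message):
    print(command)
```

    Paste the description from your own event in place of `message`; the printed lines are then the commands to run in an elevated PowerShell window.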

    Sunday, December 17, 2017 4:01 PM
  • Dear Josh, Thanks for the pointer, I'll have a look and see what I can do probably after Christmas when I have a little more time. It looks like I've a bit of a learning curve to follow! Richard

    Monday, December 18, 2017 8:37 AM
  • Any luck?  I'm having the same issue and can't figure it out.  I did notice I'm able to write large transfers to the drive directly, but if I have a VHD on the drive, I can't write to the VHD without this problem happening.
    Wednesday, December 27, 2017 9:48 PM
  • No progress at the moment. I know someone who works for Microsoft and I'm hoping for some help from him or a colleague. I'll let you know if I get any joy.  
    Saturday, December 30, 2017 1:22 PM
  • I am also having this problem. It seems to have occurred after restarting my HomeGroup, which recent updates seem to have disabled. I have tried adding additional storage as requested by the Storage Spaces dialog, but it won't allow additions. All my data is in limbo. I have run WD diagnostic scans on all the underlying physical drives without error. I do not have backups of this data because Storage Spaces are supposed to be fault tolerant. I am furious.

    Please help

    Tuesday, January 2, 2018 2:12 PM
  • I just got this problem today, while copying/configuring my Storage Spaces volume to share a folder with the network/HomeGroup.

    I have 4 identical physical disks, all HGST HDN726060ALE610 drives. The Storage Spaces GUI reports all physical disks are healthy, but I am seeing WARNINGs in the Event Viewer for NTFS. I'm unable to discern which of the physical disks are producing the errors. I ran the "SMART short test" on all of the drives, using the HGST WinDFT tool, and all report Passing results. 

    The Event Viewer warning reads: "The system failed to flush data to the transaction log. Corruption may occur in VolumeId: D:, DeviceName: \Device\HarddiskVolume10.(A device which does not exist was specified.)"

    If I click [Bring Online] the system hangs for a long time and then eventually recovers, but the volume isn't online. I tried the PowerShell commands above and they didn't bring anything online. 

    Wednesday, January 3, 2018 2:28 AM
  • I am in the exact same scenario as you, OP, except that I was unable to access the drive read-only using the PowerShell commands; it just hangs and never mounts. I was, however, able to access my data and copy it to another drive using

    https://www.cgsecurity.org/wiki/TestDisk

    It's a band-aid and not a real fix but if you have important data you need to salvage it does the job. If anyone manages to actually find a fix please let us know, thank you. 


    • Edited by Shoes51 Wednesday, January 3, 2018 7:42 AM
    Wednesday, January 3, 2018 7:42 AM
  • Same problem. Started in late December 2017. Any update on this?
    • Edited by brekse Sunday, January 7, 2018 7:50 PM
    Sunday, January 7, 2018 7:50 PM
  • I am furious, too. It seems to me that Storage Spaces is actually more faulty than the disks it is supposed to provide resilience for... So my hardware is less faulty than the software that should prevent losses from hardware faults. Incredible. Unfortunately, I was not able to get access to my documents as described above. I would really appreciate any other ideas.

    What I did was add many other disks, remove the disks of the original pool completely, and then move the data back, but the error still exists... Any ideas?

    Monday, January 8, 2018 5:43 PM
  • Same issue. My storage space had been going offline around once a day - I then stopped a bunch of things that may have been causing some stress to it IO wise. Seemed to act better. Then I tried to change permissions on a lot of items at once, and it just disconnected and would never come back up. I ran stress tests on my 3 4TB WD reds and they all show 100% healthy. I'm able to easily pull data off using a 3rd party tool. I've tried adding a 4th disk at Storage Space's suggestion but no luck yet.
    Tuesday, January 9, 2018 2:30 AM
  • I have the same issue. I am working with a data recovery tool to extract the data. The data are not lost.
    Wednesday, January 10, 2018 2:14 AM
  • I had a Windows 10 (1709) storage space parity drive that went offline each time I attempted to transfer any file over about 350MB to it. In each case the physical drives reported that they were healthy and I was able to use the Storage Spaces Control Panel to bring it right back online.

    Concerned that there was, in fact, a failing drive, when I built a new computer earlier this month I replaced all 3 1TB drives with 3 new 1TB drives from a different manufacturer. New motherboard, CPU, memory, and drives. Windows 10 (1709) loaded from scratch (not upgraded or migrated in any way).  Created a 750GB parity drive, just like before.

    Same exact problem!  Copying a 456MB file works for a short time, hangs for several minutes, and then Windows reports that the path is not found.  The drive disappears from File Explorer, but it can instantly be brought back online using the Storage Spaces GUI.

    From my perspective this sure seems like a Windows 10 problem and not something directly hardware related.

    I'm not sure if this is the ideal place to put in my 2 cents worth because I have never lost data this way, but my symptoms exactly match the thread's title, and this carries from the original answers.microsoft.com thread.

    Sunday, January 14, 2018 1:22 AM
  • I'm having these issues as well. Usually I manage to kick the drive back online somehow (threats, pleadings, goat sacrifices... all the usual stuff), but today it just refused and stayed offline. Searching for the issue led me to this thread, which actually made me feel a bit less worried about the whole thing, as I was concerned that I had a stealth-faulty drive in the pool... not that a flaky storage system is much better.

    I also tried the Get-VirtualDisk commands in PowerShell with no luck; it would just hang for a bit and then go back to being offline, just as when trying to bring it online via the storage manager.

    On a whim I repeated the -IsOffline command but instead set it to -IsOffline $true and then, again, set it to $false and lo and behold! The drive came back up instantly!

    Edit: ..it came up but in a read only state. *sigh* However, that still feels like progress.
    • Edited by Tomas_K_S Tuesday, January 16, 2018 3:09 AM
    Tuesday, January 16, 2018 3:06 AM
  • Exactly the same behaviour here. Given that all of these posts and problems seem to be from the last month or two, I wonder whether a pushed update has caused this. My storage pool had been in operation for several months with no issue. Then, after a reboot, it was offline and read-only. I can bring it online, but only read-only.
    Wednesday, January 17, 2018 2:14 PM
  • I am experiencing problems as well, however I can bring the disk online again.

    Virtual disk {e70ac80c-d67c-11e2-be69-7054d21b10fe} has failed a write operation to all its copies.                  

    Have you tried this: 

    Get-VirtualDisk | ?{ $_.ObjectId -Match "{e70ac80c-d67c-11e2-be69-7054d21b10fe}" } | Get-Disk | Set-Disk -IsReadOnly $false                  
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{e70ac80c-d67c-11e2-be69-7054d21b10fe}" } | Get-Disk | Set-Disk -IsOffline  $false

    If you only have one virtual drive, you can omit the ID match and just pipe the virtual disk object into Get-Disk.

    The above commands are mentioned in an event log event from the StorageSpaces-Driver

    Monday, January 22, 2018 7:56 PM
  • The same has been happening here since today. During optimization (I added drives) the virtual disk went offline with a critical write error. I can only bring it back online in a read-only state, and I am not able to copy the data over to a new VD given the amount of data on it. The strange thing is that there is another VD in the same storage pool with no issues whatsoever...

    FriendlyName   Usage MediaType   ResiliencySettingName PhysicalDiskRedundancy OperationalStatus HealthStatus IsManualAttach
    ------------   ----- ---------   --------------------- ---------------------- ----------------- ------------ --------------
    Opslagruimte   Data  Unspecified Parity                1                      OK                Healthy      False
    Opslagruimte 2 Data  Unspecified Parity                1                      OK                Healthy      False


    Opslagruimte is the VD that's read-only; Opslagruimte 2 has no problems, so I can use it for new files. I do need write access to my old files as well, though.

    I also wonder whether I should start the optimization again, since it's not even close to completed. It might fail as soon as I start it because of the write protection, though. Or am I wrong about that?  :s

    Update:

    tried the following commands to get more info

    Connect-Virtualdisk -Friendlyname "Opslagruimte"

    get-virtualdisk | fl 

    ObjectId                          : {1}\\DESKTOP-U1GVUSQ\root/Microsoft/Windows/Storage/Providers_v2\SPACES_VirtualDisk.ObjectId="{973f0398-6ac4-11e6-a663-806e6f6e6963}:VD:{2633de84-c910-4a0f-a001-4730db02d1e8}{86d288ee-2d6e-4895-9d46-6e408df49cca}"
    PassThroughClass                  :
    PassThroughIds                    :
    PassThroughNamespace              :
    PassThroughServer                 :
    UniqueId                          : EE88D2866E2D95489D466E408DF49CCA
    Access                            : Read/Write
    AllocatedSize                     : 11574936862720
    AllocationUnitSize                : 268435456
    ColumnIsolation                   : PhysicalDisk
    DetachedReason                    : None
    FaultDomainAwareness              : PhysicalDisk
    FootprintOnPool                   : 17362942164992
    FriendlyName                      : Opslagruimte
    HealthStatus                      : Healthy
    Interleave                        : 262144
    IsDeduplicationEnabled            : False
    IsEnclosureAware                  : False
    IsManualAttach                    : False
    IsSnapshot                        : False
    IsTiered                          : False
    LogicalSectorSize                 : 4096
    MediaType                         : Unspecified
    Name                              :
    NameFormat                        :
    NumberOfAvailableCopies           :
    NumberOfColumns                   : 3
    NumberOfDataCopies                : 1
    NumberOfGroups                    : 1
    OperationalStatus                 : OK
    OtherOperationalStatusDescription :
    OtherUsageDescription             :
    ParityLayout                      : Rotated Parity
    PhysicalDiskRedundancy            : 1
    PhysicalSectorSize                : 4096
    ProvisioningType                  : Thin
    ReadCacheSize                     : 0
    RequestNoSinglePointOfFailure     : False
    ResiliencySettingName             : Parity
    Size                              : 17482664378368
    UniqueIdFormat                    : Vendor Specific
    UniqueIdFormatDescription         :
    Usage                             : Data
    WriteCacheSize                    : 33554432
    PSComputerName                    :


    What I found strange is that here the access is Read/Write while Windows reports it is read-only.

    Any thoughts? I might be completely off the issue since I am not experienced in this kind of stuff, so sorry if I just talk nonsense ^^

    chkdsk with no parameters on the drive returns several incorrect index entries. chkdsk /f /r /x doesn't work since the drive is read-only so no way to fix those afaik. repair-volume command gives same return (as expected)
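
    One thing the Get-VirtualDisk output above does allow is a rough check of the parity layout. With NumberOfColumns 3 and PhysicalDiskRedundancy 1, each stripe should hold two data columns and one parity column, so FootprintOnPool should be about 3/2 of AllocatedSize. The sketch below uses the values from the listing; the `columns / (columns - redundancy)` rule is an assumption about how single parity is laid out, not something the output states.

```python
# Values copied from the Get-VirtualDisk listing above.
ALLOCATED_SIZE = 11574936862720
FOOTPRINT_ON_POOL = 17362942164992
NUMBER_OF_COLUMNS = 3
PHYSICAL_DISK_REDUNDANCY = 1


def expected_overhead(columns: int, redundancy: int) -> float:
    """Assumed parity overhead: total columns divided by data columns."""
    data_columns = columns - redundancy
    if data_columns <= 0:
        raise ValueError("need at least one data column")
    return columns / data_columns


observed = FOOTPRINT_ON_POOL / ALLOCATED_SIZE
expected = expected_overhead(NUMBER_OF_COLUMNS, PHYSICAL_DISK_REDUNDANCY)
print(f"observed {observed:.5f}, expected {expected:.5f}")
```

    The two ratios agree to within a small fraction of a percent, which suggests the space metadata itself is laid out as a normal 3-column parity space.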


    • Edited by Orozar7 Friday, February 2, 2018 3:03 AM Update
    Friday, February 2, 2018 12:28 AM
  • I ran into the same 'The disk is offline because of critical write failures' issue and scoured the internet for a fix.  3 x 6TB drives unreachable. Offline in Disk Manager, and received the following error when attempting to bring the virtual disk back online (per MS TechNet advice), 'The Object invoked has disconnected from its clients'.

    Before running additional scan tools as some advised, and before touching the PC to check/replace cables, etc.  I remembered what changed since the disks were last healthy.  The Windows 10 Creators Update, or whatever the most recent massive update was called.  This backup PC with internal drives hadn't been updated until yesterday 2/2.  On checking for the storage space the first time on 2/3...gone.

    Solution (in this case),  I checked all appropriate or associated Windows services for stopped/running and the startup configuration for each (auto/manual).  The 'Microsoft Storage Spaces SMP' service was not "running" (stopped) and was set to 'Manual'.  Description of the service caught my eye, 'Host service for the Microsoft Storage Spaces management provider. If this service is stopped or disabled, Storage Spaces cannot be managed.'  Opened properties and changed the 'Startup type' from Disabled to Automatic. Started the service, and the large virtual disk in Disk Manager now no longer showed angled lines through the drive within the UI.  Right clicked on the drive, selected 'Online', and the virtual disk is back and functional.

    A lot of text to say, verify all appropriate Windows services are running before you break open your box or attempt emergency disk repair/file transfers.     
    Saturday, February 3, 2018 4:02 PM
  • Hello,

    Same issues for more than a month.

    First, Windows crashed when I started to copy/delete some folders on a storage space using 3 disks, including 2 recent drives (WD Blue). Windows wouldn't start; I had a black screen before the login screen for ages. I tried connecting the 3 HDDs to another machine without success: they were not recognized as a storage pool, even though Windows saw them. I started to think that they were dead, so I tried different solutions. In the end I just left Windows on the black screen, and it finally got past it after 20 minutes. Then I replaced 2 of the 3 drives (the WDs), because I thought they were dead, even though SpeedFan said they were OK. One by one, I swapped the 2 drives for 2 new Seagates and rebuilt the RAID.

    Everything seemed to work normally, but I then noticed a huge drop in performance. Windows was lagging every time I copied or deleted a file or folder. I tried deactivating some services like Prefetch, Superfetch, etc. Performance was fine for maybe 2 days, and then, every time I wrote to the drive, it went to 100% utilization with 0 transfer, Explorer stopped responding, and the drive went offline after 5 minutes.

    And I ended up on this topic, and finally understood that it was not only me... I tested the 2 WDs, and they are perfectly fine... Well, now I have managed to mount the storage space read-only, because mounting it in write mode just crashes: 100% utilization with no response for 5 minutes, and then it goes offline. And I'm backing up my data onto my 2 free drives...

    I really hope that Microsoft will fix this bug, because it is very hard to lose time like this... I have so many things to do after work, and I really don't like losing time because of a bug in an old feature that I have used since Windows 8, one that is threatening my personal data...


    • Edited by Censseo Thursday, February 8, 2018 8:46 AM
    • Proposed as answer by NKatsaras Friday, August 24, 2018 2:18 AM
    Thursday, February 8, 2018 8:44 AM
  • Wanted to report the same issue as everyone else here. Running 4x4TB HGST drives in parity, formatted as ReFS with file integrity enabled. They had been stable for months and just started doing this after a period of high I/O. I did get the "Better every time" popup right around the same time this happened, so it does seem like an update caused it. Running Windows 10 Enterprise, version 1709. The drives are a few months old and were extensively tested with 8 full passes of read and write prior to using them in Storage Spaces. 

    EDIT: Seems like this issue is limited to Parity drives only? I recently converted from mirrored to Parity and it only happened on Parity. All the reports are for drives set to parity.

    Friday, February 9, 2018 4:46 AM
  • I am having the same issue - all drives show healthy, but as soon as I bring the drive online in Disk Manager I get 1-2 minutes and then it dies.  v1709, build 16299.192.  3 x HGST HDN724040ALE640 set up in a parity config.
    Friday, February 9, 2018 5:50 AM
  • There are many different posters in this thread.

    While the underlying problems are similar there may be problems unique to one computer.

    Some should open new threads so that a deeper evaluation can be performed.

    Sunday, February 11, 2018 6:01 AM
  • I was in the same boat...critical write failures and couldn't perform chkdsk due to read only.  Am now back up and running.  **Before getting started, make sure to read and follow all manufacturer guidance and safety procedures regarding connecting/disconnecting drives.**  I randomly selected one of my 3 drives (if the first one hadn't worked, I was going to try it with every one) that are in parity and disconnected it.  I booted back up and the storage space was connected, in read/write mode, and looking good.  When I went to storage spaces in control panel, as expected, it showed 1 drive missing and had some warnings.  I shutdown the pc, reconnected the drive and booted back up.  Storage Space is attached, all drives showing ok in control panel, and storage spaces began repairing the disk.  Once that was done, I kicked off a chkdsk /f /r /x   Good luck!
    Monday, February 12, 2018 4:59 AM
  • Hi All,

    Has anyone found an answer to this on how to diagnose and bring the drives back online?

    I have come across the same issue. However, thinking that the drives were getting a bit old and might be on the way out, I removed them and added new ones to the pool, which worked without error.  

    So far I have performed the following:

    1) Copied files onto external storage without any issues.

    2) Performed tests on the drives using the manufacturer's diagnostic software; no issues found.

    3) Added a brand new drive to the pool. Optimisation completed successfully with no errors, but I was still unable to bring the pool online.

    4) Removed the 3 Samsung drives individually using "prepare for removal" and replaced them with new Seagate drives. The process completed successfully without error. I attempted to bring the pool online after each removal/addition, but still got the same result.

    5) Performed a clean install of Windows with the drives disconnected. Once this had been completed and the drives were reconnected, the system again took 10+ minutes to boot.

    I am now looking at a storage pool with 4 brand new drives that I am unable to bring online. Again, the files on the pool are visible and, from the tests I have performed, appear to be in one piece.

    The only thing I have not done yet is to delete the pool, which I am reluctant to do as these are new drives. 


    Sunday, February 18, 2018 8:19 PM
  • Same here!
    Wednesday, February 21, 2018 1:33 PM
  • I've got the same problem with my PC, where I have 5 × 3 TB disks in a Storage Space using parity, reporting the same error code. So far I've not had any luck getting access to the Storage Space using Josh's method or any of the other methods listed in this thread.  All drives report as up and fully operational.  I've taken them offline and brought them back online, but I was never able to get anything to work.
    Friday, February 23, 2018 3:48 PM
  • Hi, guys.

    I think the PowerShell cmdlets do not work, but diskpart works well.

    Try the following steps to bring the drive online and make it writable.

    1. Open the command prompt as administrator
    2. Enter diskpart
    3. Enter list disk
    4. Enter select disk x
      x is the number of your offline virtual disk.
    5. Enter online disk
    6. Enter attributes disk clear readonly
    7. Enter exit
    8. Run chkdsk X: /r
      X is the drive letter of your virtual disk.
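
    The diskpart part of the steps above (4 to 7) can also be saved to a text file and run from an elevated prompt with `diskpart /s <file>`. A minimal Python sketch that generates such a script is below; the function name and the disk number 3 are placeholders for illustration, so confirm the real number with `list disk` first.

```python
def diskpart_script(disk_number: int) -> str:
    """Return diskpart commands that online a disk and clear its read-only flag.

    Mirrors steps 4-7 above. The disk number must be the one that
    `list disk` reports for the offline virtual disk.
    """
    if disk_number < 0:
        raise ValueError("disk number must be non-negative")
    lines = [
        f"select disk {disk_number}",
        "online disk",
        "attributes disk clear readonly",
        "exit",
    ]
    return "\n".join(lines) + "\n"


# Example with a made-up disk number; write the text to a file of your choice
# and pass that file to `diskpart /s`.
print(diskpart_script(3), end="")
```

    Step 8 (chkdsk) still has to be run separately afterwards, and only once the disk actually stays online and writable.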



    Masahiko Sada
    Microsoft MVP for Cloud and Datacenter Management http://satsumahomeserver.com

    • Proposed as answer by timnelsonic Friday, October 19, 2018 5:31 PM
    Sunday, March 11, 2018 9:07 AM
  • Throwing another log on the fire.

    I had 6 × 2 TB disks in a parity space for 6 months and they were working fine until 1709 showed up. Since then, they go offline due to critical write failures. I've been testing pools and individual drives and everything is fine until I try to load a VM up. I can install a VM, and latency jumps up to 2000 ms at times. After the post-install reboot, the VMs just sit there on the Hyper-V loading screen and bluescreen when IO drops to zero.
    Thursday, April 5, 2018 12:10 AM
  • attribute disk clear readonly failed, no error message, and disk array is set back to offline

    Only find out when chkdsk fails

    Sunday, April 15, 2018 7:42 PM
  • I want to just add another occurrence. I have 4 drives in a parity storage space formatted as ReFS (chkdsk will not run on this). Have run extended disk scans on all drives to ensure they are functioning properly. They are not the same model or size (4TB, 4TB, 5TB, 5TB).

    When I try to write a large amount of data at a time, as others have mentioned around a couple hundred MB, the volume will peg to 100% disk usage according to task manager and then offline itself with the specified error. I can actually bring it back online and it functions fine until I try to do a large write again.

    I'm running Windows 10 Enterprise v1709. Build 16299.371

    Saturday, April 21, 2018 5:06 AM
  • Thanks for the Command Prompt sequence.

    At step 6, the drive goes offline again. I repeat 1-5 and it allows copies (to backup) but won't allow writes.

    When I used "Repair-VirtualDisk" I get the following...

    ------- ------- ------- 

    PS C:\Users\Ed> Repair-VirtualDisk ourNAS01_pool
    Repair-VirtualDisk : Access denied
    Activity ID: {8cd68d25-e58e-46e4-806d-dab700110e60}
    At line:1 char:1
    + Repair-VirtualDisk ourNAS01_pool
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        + CategoryInfo          : PermissionDenied: (StorageWMI:ROOT/Microsoft/...SFT_VirtualDisk) [Repair-VirtualDisk], CimException
        + FullyQualifiedErrorId : StorageWMI 40001,Repair-VirtualDisk

    ------- ------- ------- 

    Any Ideas on troubleshooting or repair?

    Thanks,

    Ed

    Monday, April 23, 2018 8:29 PM
  • I was in the same boat...critical write failures and couldn't perform chkdsk due to read only.  Am now back up and running.  **Before getting started, make sure to read and follow all manufacturer guidance and safety procedures regarding connecting/disconnecting drives.**  I randomly selected one of my 3 drives (if the first one hadn't worked, I was going to try it with every one) that are in parity and disconnected it.  I booted back up and the storage space was connected, in read/write mode, and looking good.  When I went to storage spaces in control panel, as expected, it showed 1 drive missing and had some warnings.  I shutdown the pc, reconnected the drive and booted back up.  Storage Space is attached, all drives showing ok in control panel, and storage spaces began repairing the disk.  Once that was done, I kicked off a chkdsk /f /r /x   Good luck!

    I googled the hell out of this issue, trying to come up with a way to bring the parity space online in write mode!! I had 30 TB over 8 disks in my storage space that was 'lost' because of a Windows crash during a delete. This worked for me!!

    I will just add my flavour: as I had quite a few disks, it wasn't practical for me to unplug each HDD, so I just disabled each HDD in turn via Device Manager and rebooted until the space came back online.  Then, once I had identified it, I re-enabled the HDD and ran chkdsk.

    Very glad this is now operational!

    Monday, May 21, 2018 1:02 PM
  • Same issue here, got critical write error from time to time, can bring it back online via GUI, it's very annoying.
    Monday, July 30, 2018 4:42 PM
  • Another one here. 4 x 4TB drives with parity.

    It has been going offline for a couple of months now. I can bring it online again, but it will go off again...

    Drives apparently healthy.

    Tuesday, July 31, 2018 3:55 PM
  • I also have 3x3TB (ST3000DM008-2DM166) drives in a storage spaces parity pool, and intermittently experience this same issue as many others here have also been writing about.  These are all brand new drives which test perfectly well and only have this issue which seems like a software bug.

    Log Name:      Microsoft-Windows-StorageSpaces-Driver/Operational
    Source:        Microsoft-Windows-StorageSpaces-Driver
    Date:          8/23/2018 9:56:06 PM
    Event ID:      312
    Task Category: None
    Level:         Error
    Keywords:      
    User:          SYSTEM
    Computer:      Starship01
    Description:
    Virtual disk {4b888fba-68aa-47c8-aac9-4d986b573232} has failed a write operation to all its copies.                  
                      
    You can online the disk by using the following commands in PowerShell:                  
                      
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{4b888fba-68aa-47c8-aac9-4d986b573232}" } | Get-Disk | Set-Disk -IsReadOnly $false                  
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{4b888fba-68aa-47c8-aac9-4d986b573232}" } | Get-Disk | Set-Disk -IsOffline  $false
    Event Xml:
    <Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
      <System>
        <Provider Name="Microsoft-Windows-StorageSpaces-Driver" Guid="{595F7F52-C90A-4026-A125-8EB5E083F15E}" />
        <EventID>312</EventID>
        <Version>0</Version>
        <Level>2</Level>
        <Task>0</Task>
        <Opcode>0</Opcode>
        <Keywords>0x8000000000000000</Keywords>
        <TimeCreated SystemTime="2018-08-24T01:56:06.362844900Z" />
        <EventRecordID>36</EventRecordID>
        <Correlation />
        <Execution ProcessID="4" ThreadID="3900" />
        <Channel>Microsoft-Windows-StorageSpaces-Driver/Operational</Channel>
        <Computer>Starship01</Computer>
        <Security UserID="S-1-5-18" />
      </System>
      <EventData>
        <Data Name="Id">{4B888FBA-68AA-47C8-AAC9-4D986B573232}</Data>
        <Data Name="Status">3221226603</Data>
      </EventData>
    </Event>
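
    The Status value in the EventData above is printed in decimal and appears to be an NTSTATUS code, which is easier to look up in hexadecimal. A small sketch of the conversion follows; it only reformats the number from this event and does not try to name the code (that would need checking against Microsoft's NTSTATUS list).

    ```powershell
    # Status value copied from the <Data Name="Status"> element of the event above.
    $status = 3221226603

    # Format as an 8-digit hexadecimal code.
    $hex = '0x{0:X8}' -f $status
    $hex         # 0xC000046B

    # In an NTSTATUS value the top two bits encode severity; 3 means "error".
    $severity = ($status -shr 30) -band 3
    $severity    # 3
    ```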


    Friday, August 24, 2018 2:11 AM
  • I could not get my volume back online, but was able to use R-Studio to recover all the files. While copying the files to another device, I purchased an Adaptec hardware RAID controller because I don't trust Storage Spaces anymore; plus, the write speed with RAID 5 is terrible (~20MB/s) with 4 drives.
    Sunday, September 2, 2018 11:44 PM
  • The drive is accessible after completing number 5, though it has trouble navigating some directories due to not having permissions.  Step 6 causes it to force offline again.
    Saturday, October 6, 2018 9:40 PM
  • Same problem on Windows 10 and Windows Server 2019.

    Windows Server 2016 worked fine. Please fix.

    Thursday, November 15, 2018 8:33 AM
  • Same problem here: Windows Server 2016, parity disks.

    To get my data off the disk, I did the following:

    Use "Get-Disk" to get the name of the offline disk; in my case its name is "Data".

    I used Disk Management to bring it online,

    then (quickly)
    Get-Disk -FriendlyName Data | Set-Disk -IsReadOnly $true
    to set the disk to read-only, and

    Get-Disk -FriendlyName Data | Select-Object IsReadOnly
    to check its status.

    Once it is online and read-only, its status is stable, and I was able to copy the files to another storage device.

    Steve from Vienna

    Friday, November 23, 2018 7:31 PM
  • Same here:

    I recently created a storage space in Windows 10 with 3 x 4TB drives. Once in a while, the volume disappears entirely from Windows. I have my Users profile folder on the drive so when it happens, everything starts crashing.

    Last night it happened and I had to restart the computer 5 times before the volume came back by itself and the profiles started working again.

    The storage space is perfectly healthy when working. All 3 drives are in perfect health according to SMART status.

    I am worried about losing my data one day which entirely defeats the purpose of a storage space with redundancy! It has happened maybe 3 times in the past 3 months.

    Error in event viewer:

    Virtual disk {25d45799-eda3-4ca5-9b65-8a296209e768} has failed a write operation to all its copies.                  
                      
    You can online the disk by using the following commands in PowerShell:                  
                      
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{25d45799-eda3-4ca5-9b65-8a296209e768}" } | Get-Disk | Set-Disk -IsReadOnly $false                  
    Get-VirtualDisk | ?{ $_.ObjectId -Match "{25d45799-eda3-4ca5-9b65-8a296209e768}" } | Get-Disk | Set-Disk -IsOffline  $false

    Hardware: MSI 970 Gaming ATX AM3 970 DDR3 Motherboard

    Drives: Hitachi HGST HDN 726040ALE614 and 2 x WDC WD40 02FYYZ-01B7CB1

    Storage Pool Type: Parity

    OS: Windows 10 Pro Version 1803 Build 17134.523


    • Edited by integriz Wednesday, January 30, 2019 1:16 PM
    Wednesday, January 30, 2019 1:14 PM
  • +1 same issue here.

    Was moving files / folders around within the storage pool when the computer (Dell XPS running Windows 10 Pro (1803)) keeled over.  Six hours finding a way to at least see the data. Off to buy new disks and a hardware RAID card tomorrow it seems, Storage Spaces aren't up to the job.  

    Thanks to those who've commented, so glad you've given tips on how to recover data without resorting to slow backup restores.

    Wednesday, February 27, 2019 9:37 PM
  • I face the same issue here as well. I had the same problem years ago with Server 2012 R2 and 2016, and it is still there in my new build with Server 2019.

    First I used 4 TB WD Green drives, up to 6 in parity, and the volume went offline randomly. Now I'm running a pool with 4x 12TB WD HC520 and 2x 256 GB SSDs:

    1 volume, mirrored, tiered

    1 volume, parity, tiered

    While they all share the same drives, only my parity volume goes offline randomly. I can bring it back online instantly with "Bring online", but it sucks so much. I wonder how this crap made it through QA.

    Storage Spaces in Windows Server is the same crap as on W10. Absolutely unreliable. That's why I used to use my LSI RAID controller instead, but I thought I could give Storage Spaces another chance. I was wrong.

    Both the mirror and the parity volume are ReFS, so no chkdsk.

    Edit: I just updated my internal LSI to P20 with this image

    https://forums.servethehome.com/index.php?threads/upgrade-lsi-2116it-in-x10sdv-4c-7tp4f-to-p20.14211/

    and I set the IO and Removed device delay to 1 instead of 0 in the LSI configuration. Let's see if that helps.


    Thursday, March 7, 2019 12:59 PM
  • Okay. Still the same issue, which means it's a bug in my opinion. I have had it on Server 2012, 2016 and 2019, with many different drivers and different disks, even with a controller delay on disk removal, and even with a mirror volume on the same disks the parity volume randomly goes offline... Practically unusable for a multi-thousand-dollar OS. If you need any information to figure this out, please let me know. I can easily grant access to the affected server and volume... No issue for me.
    Saturday, March 9, 2019 9:40 PM
  • 1) Run HD Tune:  (free version)

    https://www.hdtune.com/

    Post images into the thread for results on these tabs:

    a) Health (SMART)

    b) Benchmark

    c) Full error scan

    2) Run HD Sentinel (free or trial version)

    https://www.hdsentinel.com/

    Post an image of the overview tab results into the thread

    3) Run SeaTools for Windows long generic test:

    http://www.seagate.com/support/downloads/seatools/seatools-win-master/
    http://knowledge.seagate.com/articles/en_US/FAQ/202435en

    4) Were you able to check other chkdsk switches on the drives?

    chkdsk /r /v C:

    (This may take many hours to run so plan to run overnight or when there is significant downtime)

    Use the syntax chkdsk /r /v C: or chkdsk /r /v D: to run the command on each drive.


    C:\WINDOWS\system32>chkdsk /r /v C:
    The type of the file system is NTFS.
    Cannot lock current drive.

    Chkdsk cannot run because the volume is in use by another
    process.  Would you like to schedule this volume to be
    checked the next time the system restarts? (Y/N)

    Type:  Y

    reboot
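
    As a rough complement to the SMART views in the tools above, Windows exposes some per-drive error counters through PowerShell. The sketch below assumes the Storage module cmdlets Get-PhysicalDisk and Get-StorageReliabilityCounter; the function name Get-DriveErrorSummary is made up for illustration. Many drives and controllers leave some of these fields empty, so a blank value does not prove the drive is healthy.

    ```powershell
    # Hypothetical helper (the name is illustrative): per-drive error counters
    # as reported by the Storage module.
    function Get-DriveErrorSummary {
        Get-PhysicalDisk | ForEach-Object {
            $disk = $_
            $counters = $disk | Get-StorageReliabilityCounter
            [pscustomobject]@{
                FriendlyName           = $disk.FriendlyName
                SerialNumber           = $disk.SerialNumber
                HealthStatus           = $disk.HealthStatus
                ReadErrorsUncorrected  = $counters.ReadErrorsUncorrected
                WriteErrorsUncorrected = $counters.WriteErrorsUncorrected
                PowerOnHours           = $counters.PowerOnHours
                Temperature            = $counters.Temperature
            }
        }
    }

    # Usage (elevated PowerShell): Get-DriveErrorSummary | Format-Table -AutoSize
    ```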




    Sunday, March 10, 2019 1:02 AM
  • Windows 10 Ent 1903, 6 drives in a Parity pool.

    TL;DR: Check the System event log for Event 153 to identify the bad drive, then replace the drive cable.

    I had this issue with my parity storage space.  It started to go offline after some intensive write operations.  A reboot would bring it back online and functional.  After a few repeats of this I investigated the System event log and noticed a number of Event 153 errors for a particular drive.  This helped me identify which drive was at fault.

    When the pool was online I attempted to remove that drive from the pool.  The process of preparing the drive for removal started.  After a few minutes my system would hard crash/freeze with no blue screen, and after about 2-3 minutes it would hard power down.  Yikes.  More 153 errors in the log.  I'd reboot and the status was still "preparing for removal".  Sometimes the Storage Spaces management window would close itself, and then the system would hard crash.  Sometimes it would just hard crash.

    I then physically disconnected the drive.  The storage pool was OK.  I believe I could have just proceeded to replace the drive with no data loss (that's the whole idea of Storage Spaces, right?!).  However, I plugged it back in to see if it would be OK.  It was, and slowly I saw the % used of the drive go down.  It finished the prep process OK.

    Once the drive was removed I ran HD Tune, mentioned above.  It highlighted that the Interface CRC Error Count was high and noted that this is often due to a bad cable.  I replaced that SATA cable and ran many drive health checks, which gave little indication that the drive is bad.  So I'm chalking this all up to a bad/loose cable.
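
    The Event 153 check can be sketched in PowerShell roughly as follows. It assumes Get-WinEvent with a hashtable filter, and it assumes the English message text contains "Disk <number>", which may not hold on other locales or Windows builds; the function name Get-DiskRetryEvents and the 30-day default are made up for illustration.

    ```powershell
    # Hypothetical helper (the name is illustrative): count Event ID 153
    # entries in the System log, grouped by the disk number in the message.
    function Get-DiskRetryEvents {
        param([int]$Days = 30)

        $filter = @{
            LogName   = 'System'
            Id        = 153
            StartTime = (Get-Date).AddDays(-$Days)
        }

        # Get-WinEvent reports an error when nothing matches, so silence that case.
        Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
            Group-Object -Property {
                # Assumed message format: "... for Disk 3 (PDO name: ...) was retried."
                if ($_.Message -match 'Disk (\d+)') { "Disk $($Matches[1])" } else { 'unknown' }
            } |
            Sort-Object -Property Count -Descending |
            Select-Object Count, Name
    }

    # Usage (elevated PowerShell): Get-DiskRetryEvents -Days 14
    ```

    A disk that dominates the counts is the first candidate for a cable or port swap; the disk number can be matched to a physical drive with Get-Disk.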

    Friday, December 6, 2019 8:30 PM
  • Sorry to reply so much later than your post, but I just want to say this is gold; this solution worked fine for me. I'm not going to trust that volume and will migrate all my data off it, but it saved me from losing a few days' worth of data by having to revert to my monthly backup (I'm just a home user).

    Saturday, March 14, 2020 6:41 PM
  • I know that this is an old post, but this is the EXACT thing that has happened to me. I have two storage tiers:

    1) SSD Tier (Consisting of a pair of Intel data center NVMe drives)

    2) HDD Tier (Consisting of 12  HGST SAS drives)

    My volume is ReFS so I cannot do a CHKDSK on it. I can online the pool and access my volume just fine in read and write mode, but when I write a large file to it, it goes offline. I am running a fresh, clean version of Server 2019 Standard. Please advise.
    Wednesday, June 3, 2020 3:11 AM