Evicting disks from a Storage Space

  • Question

  • There's a new option when managing physical disks in Storage Spaces in the RC: Evict. Will it migrate all data on the drive to other disks (provided there is enough space in the pool) before removing it, or will it simply drop the disk from the storage space (which would cause all non-parity/unmirrored drives to fail)?

    Saturday, June 2, 2012 6:13 AM

All replies

  • Hi,

    If you use Server Manager to perform this operation, you'll either get warnings or be blocked outright, depending on the specific case. For example, if the disk holds an extension of a simple disk, trying to perform this operation from the Server Manager interface will be blocked until you remove the simple disk extension from that physical disk and then try to remove the physical disk again.


    • Proposed as answer by zroiy Sunday, June 3, 2012 5:56 AM
    Sunday, June 3, 2012 5:54 AM
  • I don't really understand this reply. But I'd love to know the answer to the original question. If it is the first (migrate data to other drives), that would be fantastically useful.

    More broadly on Storage Spaces, we need better summary information about the actual disk space used on each physical disk. I think there is a simple screen in the Windows 8 client that shows % disk space used by data. That would help you choose a less-used disk to evict, to minimize the time taken to migrate data. But I don't see an equivalent screen in Server 2012 RC. Right now, you have to go into each disk separately to see the amount of disk space used. EDIT: this does exist by adding columns to the view, see below.

    Finally, disk buses don't seem to be always correctly identified.

    * SATA disks on PCI-X RAID controllers (2x RAIDCore BC4852) show up as being on a "Fibre Channel" bus. This should really be "SCSI" or "RAID" - they appear to the OS as SCSI disks.

    * SATA disks on a Supermicro AOC-SASLP-MV8 controller (which is just an 8-port SATA PCIe card) show up as being on a "RAID" bus - even though there is no RAID capability for this card - it is JBOD only.

    * SATA disks on the motherboard (in Intel RAID mode) show up as "RAID" bus type - which is probably to be expected.

    Sunday, June 3, 2012 8:30 AM
  • If not eviction without replacement, Storage Spaces really needs at least the ability to switch out a drive. By that I mean being able to specify an unused/new spare drive of the same or larger size, to which all active slabs are then moved over live, after which the old drive gets dropped from the pool. Smaller servers in SoHo situations or prosumer computers can only carry a certain practical number of disks. If you want to increase the pool size, you'll have to replace disks with bigger ones at some point. Having a single simple virtual disk mess this process up is not acceptable, even if that virtual disk is only a temp or scratch disk.

    I guess there's the benefit of the doubt for a first release, but if a feature like this is planned to be added, please consider delivering it out of band in a service pack. Three years to the next release is too long a delay.

    Sunday, June 3, 2012 12:55 PM
  • You can actually see the disk space used for each physical disk: either open Properties on the disk in the Physical Disks view or add the column to that view.
    Monday, June 4, 2012 4:48 AM
  • What I have seen so far: you can replace a physical disk as long as there is no Simple Disk present in the Pool.

    My quick test:

    - Create pool with 3 physical disks

    - Create a parity disk

    - Create a Volume on it

    Now to replace it with a different drive:

    - Select the disk you want to replace and choose "Evict" (the parity disk changes to the Warning state)

    - Add the new larger physical disk to the pool

    - The option "Remove" will eventually become available on the evicted disk (how long this takes probably depends on the amount of data)

    I agree that a simpler "Replace disk" option could/should be added for SOHO scenarios, and that it should also work with simple volumes.

    Monday, June 4, 2012 10:45 AM
  • I guess one SOHO and prosumer scenario might be switching out a failing disk (either with bad SMART values or already failing).
    Monday, June 4, 2012 11:58 AM
  • You can actually see the disk space used for each physical disk: either open Properties on the disk in the Physical Disks view or add the column to that view.

    In "properties" in the physical disk, I see the total disk space allocated to the storage space. This is typically the entire disk. But what I'm after is the actual amount of space on the physical disk that is currently being used for active slabs. That is what would drive the time taken to copy the data to a replacement disk.

    EDIT: sorry, you're right, it is the amount of disk space actually used. The entire disk is dedicated to the pool. And by right-clicking on an active column (not the blank space to the right of the columns), as you say, you can add other columns to the Physical Disks view, including used space.

    Monday, June 4, 2012 7:38 PM
  • Here are some details based on my testing.

    Once you choose "Evict", it warns you that redundancy will be lost. If you confirm, the parity virtual disk changes to "Degraded". You can then choose "Repair Virtual Disk" and the data starts being rebuilt from parity onto a new disk you added to the pool. The virtual disk changes to "InService", which means it is rebuilding.

    But in the meantime, the old disk has the option "Remove Disk" available. It shows Usage = Retired and used space as near zero. And the pool shows the full available free space of the old disk. So all this strongly implies that "Evict Disk" immediately removes the redundancy of the virtual disk by freeing the old disk, rather than migrating data. If so, the virtual disk would not be redundant until the repair finishes. And so, if you lose a disk in this period, you lose the virtual disk.

    On an 8x 500GB 7200rpm parity virtual disk, it rebuilds the new disk at ~11MB/sec on my system. So that's about 12 hours to complete, which is not particularly fast. It seems like it should be able to rebuild several times faster than that, as it doesn't appear to be bottlenecked by disk sequential read/write speed or by a CPU core. This is a concern given that during this period the virtual disk is not redundant.
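
    The 12-hour figure can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the function name is made up for illustration, it assumes the whole 500GB disk has to be rewritten at a constant rate, and the 100MB/sec comparison rate is a hypothetical sequential write speed, not something measured in this thread.

```python
def rebuild_hours(disk_gb, rate_mb_per_s):
    """Hours needed to write disk_gb gigabytes at rate_mb_per_s megabytes per second."""
    seconds = disk_gb * 1000 / rate_mb_per_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600


# One 500 GB disk rebuilt at about 11 MB/s, the figures reported in this post.
print(f"{rebuild_hours(500, 11):.1f} hours")   # prints "12.6 hours"

# Hypothetical comparison: the same disk at an assumed 100 MB/s.
print(f"{rebuild_hours(500, 100):.1f} hours")  # prints "1.4 hours"
```

    So the observed rate lines up with the roughly 12 hours seen, and a rebuild limited only by a disk's sequential speed would be expected to finish in a small fraction of that.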

    However, this is not the entire story. If you select another disk to evict while the rebuild is in progress, you can evict that one too, and it is also immediately marked as Retired. However, the second evicted disk still shows all its disk space being used, while the first evicted disk shows it is completely free. So it looks like the second eviction is queued up to happen immediately after the first rebuild completes. So at least you don't lose the virtual disk if you accidentally evict a disk that would destroy the virtual disk. But the rebuilds happen in series. There doesn't appear to be a way to run a parallel migration/reconstruction of the virtual disk to more than one drive (e.g. like you can with my 8-year-old RAIDCore RAID controllers).

    However, what happens next is really interesting. Once the first rebuild is complete, it then looks like it automatically starts to migrate data from the second old disk to the second new disk. Partway through, the old disk shows 347GB used, the new disk shows 118GB (465GB-347GB) used, and this changes over time in occasional many GB increments. So it looks like it might be migrating data from the evicted disk without losing redundancy. This seems to run much slower than the first rebuild (several days to migrate data from one 500GB disk to another on my system). The virtual disk is still marked as Warning, and it's not at all clear if it is actually redundant or not at this point. The disk usage would imply it is, but there's no way to tell.
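
    Since the UI gives no "% complete", the used-space columns are the only progress indicator. A minimal sketch of the arithmetic, assuming the old disk's used space really does represent slabs still waiting to be moved (which the thread cannot confirm), and with a function name invented for illustration:

```python
def migration_progress(total_gb, remaining_on_old_gb):
    """Return (GB moved so far, percent complete) from the used-space columns."""
    moved_gb = total_gb - remaining_on_old_gb
    return moved_gb, 100 * moved_gb / total_gb


# Figures from this post: 465 GB originally in use, 347 GB still on the old disk.
moved, pct = migration_progress(465, 347)
print(f"{moved:.0f} GB moved, {pct:.0f}% complete")  # prints "118 GB moved, 25% complete"
```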

    But if it does stay redundant through the rebuild, it's not at all clear why it immediately freed up all of the first disk I evicted, rather than migrating it as per the second disk. Storage Spaces is a complete black box on what is actually going on. We really need "% complete", "estimated completion time", and better status information (see http://social.technet.microsoft.com/Forums/en-US/winserver8gen/thread/57c4ad17-b95e-4017-a826-aa8de4ed554d)

    It does seem like there should be no good reason why SS couldn't migrate the data without losing redundancy, so long as all the old and new disks are available. It seems to sometimes do this, but not always. That would also allow migration of simple disks where there is no redundancy. It should also allow you to run simultaneous migrations to two or more new physical disks for parity spaces, which can't be done with single parity and loss of redundancy.

    Clearly the architecture seems to allow this, by separating "Evict Disk" from "Remove Disk". But it doesn't seem to be implemented yet. At least not for parity virtual disks in Server 2012 RC.

    See also here for more on this: http://social.technet.microsoft.com/Forums/en-US/winserver8gen/thread/dbbf317b-80d2-4992-b5a9-20b83526a9c2

    Friday, July 20, 2012 2:39 AM
  • What I have seen so far: you can replace a physical disk as long as there is no Simple Disk present in the Pool.

    When I came across this post I was so relieved... until I tried it on my system. It may have worked for you, but I still can't remove a failed disk.

    I have nine disks added to a pool, and 11 virtual disks created from the pool, all of them set up in parity mode. There are no simple or mirror mode disks in the pool, and I am still not able to remove a failed drive.

    • Edited by ECase Saturday, October 27, 2012 4:18 AM
    Saturday, October 27, 2012 4:18 AM
  • I have an answer for anyone else unfortunate to come across this problem later on:

    Step One: Don't panic... there is a fix :)

    Step Two: Take all the virtual disks in the pool offline. Caveat: make sure the disks are not in use before taking them offline. This means shutting down VMs and/or anything else you may have running on the disks. So much for always available, but that's minor compared to losing an entire pool.

    Step Three: Repair the virtual disks as needed. You may need to detach them.

    Step Four: Remove the failed drive and replace it when able.

    Step Five: Get on with your life... :)

    Saturday, October 27, 2012 5:05 AM
  • I have been looking for a way to do this for a while now, since the RC. It massively reduces the usefulness of storage pools if this is the kind of ballache you have to go through just to remove a failed drive.
    Saturday, October 27, 2012 7:51 AM