Storage Spaces Overview

Introduction

 

What is Storage Spaces

Storage Spaces is a storage virtualization capability introduced in Windows 8 and Windows Server 2012. It enables users to reduce the cost of highly available storage for virtualized or physical deployments, while also providing high resiliency and operational simplicity.

Storage spaces come with features that provide resiliency, scalability, high availability and ease of administrative operations.

Storage Spaces Model



Introduction to Storage Spaces

Storage Spaces gives administrators the option to deploy a resilient and highly available storage system utilizing cost-efficient hardware – commodity SAS JBOD (Just a Bunch Of Disks) enclosures, and commodity SAS adapters. Storage Spaces introduces platform storage virtualization, enabling deployments with business-critical data needs across a variety of scenarios, while providing suitable features for the appropriate set of workloads.

This platform storage virtualization introduces two new primitives, the storage pool and the storage space:

  • Storage pool: Provides a means to administer, aggregate, and isolate a set of physical disks.
  • Storage space: Provides an abstraction for resiliency, provisioned capacity, and performance level.

Storage Spaces Capabilities


Storage Spaces offers the following capabilities and features:
  • Resilient storage:
    • Storage Spaces provide resilient storage in the form of mirror spaces with dirty region tracking, and parity spaces with integrated journaling. Storage spaces also provide rapid recovery via per-pool hot spares.

  • Failover clustering:
    • Windows’ Failover Clustering is a feature that enables multiple servers (known as cluster nodes) to be formed into a cluster which provides high availability for the underlying storage and supported workloads, such as File Server and Hyper-V. Failover Clustering supports storage based on Storage Spaces utilizing commodity shared SAS JBODs, or a more traditional shared storage appliance.

  • Thin provisioning:
    • Thin provisioning allows administrators to keep storage planning straightforward, focusing on the minimum number of physical drives to deploy and the near-term size of their datasets.

  • Administrative ease and simplicity:
    • Storage Spaces gives administrators optimized and flexible capacity utilization, ease of configuration and management through PowerShell, the Control Panel applet (Windows 8), and the Server Manager UI (Windows Server 2012), easy scalability of storage pools and storage spaces, and effective utilization of hardware investment.

     

Resilient Storage


User data is increasingly valuable across all customer segments. Data loss precipitated by drive failures often has significant and expensive consequences. Storage Spaces delivers resiliency through a number of mechanisms including:
  • Mirror Spaces: Stripes and mirrors data across a set of pool disks, supporting a two-way or three-way mirror, which are respectively resilient to single disk, or double disk failures. Suitable for the majority of workloads, in both clustered and non-clustered deployments.
  • Parity Spaces: Stripes data across a set of pool disks, with a single disk write block used to store parity information, and is resilient to a single disk failure. Suitable for large block append-style workloads, such as archiving, in non-clustered deployments.

The mirror and parity resiliency delivered by Storage Spaces allows administrators to still have access to their data even after a drive failure has occurred. This allows enterprises to deploy industry-standard storage hardware and save costs, without any degradation in tolerance to drive failures and power loss events.
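As a rough illustration of how a parity space can survive the loss of one disk, the sketch below uses XOR parity, the common single-parity technique. The source does not state which parity function Storage Spaces uses, so XOR is an assumption here, and the helper `xor_blocks` and the sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks, plus one parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: XOR the surviving data blocks with the
# parity block to rebuild the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

Because only one parity block exists per stripe in this model, losing two blocks of the same stripe cannot be repaired, which matches the single-disk-failure resiliency described for parity spaces.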

Apart from mirror spaces and parity spaces, administrators can also create spaces for data that needs performance but not resiliency:

  • Simple Spaces: Stripes data across a set of pool disks, and is not resilient to any disk failures. Suitable for high performance workloads where resiliency is either not necessary, or is provided by the application.
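The trade-off between the three resiliency types can be sketched as a capacity estimate. This is a simplified model under stated assumptions: a mirror stores one full copy per mirror side, a parity space spends one column per stripe on parity, and metadata overhead is ignored. The function name and parameters are hypothetical, not part of any Windows API.

```python
def usable_capacity_tb(raw_tb, resiliency, copies=2, columns=3):
    """Estimate usable capacity for a space built from raw_tb of pool capacity.

    Assumptions (not from the source): no metadata overhead, mirror keeps
    `copies` full copies, parity uses one of `columns` columns for parity.
    """
    if resiliency == "simple":
        return float(raw_tb)
    if resiliency == "mirror":
        return raw_tb / copies
    if resiliency == "parity":
        return raw_tb * (columns - 1) / columns
    raise ValueError(f"unknown resiliency type: {resiliency!r}")


for kind, kwargs in [
    ("simple", {}),
    ("mirror", {"copies": 2}),
    ("mirror", {"copies": 3}),
    ("parity", {"columns": 3}),
]:
    print(kind, kwargs, usable_capacity_tb(12, kind, **kwargs))
```

Under this model a simple space exposes all raw capacity but tolerates no failures, while a three-way mirror exposes a third of it in exchange for tolerating two disk failures.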


Administrative Control Over Resiliency Values


To ease the creation of pools and spaces, Storage Spaces allows the administrator to set default resiliency values, such as the number of columns and the stripe interleave, for individual storage pools. This makes it easy to apply a tuned set of resiliency characteristics to every space created in the pool. When an administrator creates a space and omits these details, Storage Spaces uses the defaults set on the pool. The administrator does, however, need to specify the resiliency type (simple, mirror, or parity) when creating a space.
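The following sketch shows the idea of pool-level defaults and what the two values control. It is a conceptual model, not the Storage Spaces implementation: `PoolDefaults`, `resolve_layout`, and `locate` are hypothetical names, the numeric values are illustrative, and the round-robin striping layout is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PoolDefaults:
    columns: int = 4              # disks striped across; illustrative value
    interleave: int = 256 * 1024  # bytes written to a column before moving on


def resolve_layout(pool, columns=None, interleave=None):
    """Use the values given at space creation, else fall back to pool defaults."""
    return (
        columns if columns is not None else pool.columns,
        interleave if interleave is not None else pool.interleave,
    )


def locate(offset, columns, interleave):
    """Map a logical byte offset to (column, offset within that column),
    assuming simple round-robin striping."""
    stripe_unit = offset // interleave
    column = stripe_unit % columns
    row = stripe_unit // columns
    return column, row * interleave + offset % interleave


pool = PoolDefaults(columns=4, interleave=65536)
columns, interleave = resolve_layout(pool)         # nothing specified: pool defaults
print(columns, interleave)
print(resolve_layout(pool, columns=2))             # explicit value overrides default
print(locate(4 * 65536 + 10, columns, interleave)) # wraps back to the first column
```

More columns spread each stripe over more disks, and the interleave sets how much contiguous data lands on one disk before the next one is used.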


Power Fault Resiliency

Mirror Spaces:

Mirror spaces have the following crash resiliency features:

  • Dirty Region Tracking: Storage Spaces enables targeted, rapid re-syncing of mirrors by tracking dirty regions (regions of a mirror space which have in-flight data).  Hence, mirror spaces quickly recover from a power loss or similar server fault event, by re-syncing only those regions.

  • Scrubber: Scrubber provides a way to fix latent inconsistency issues. With NTFS, scrubber compares multiple copies of data and looks for inconsistencies. This way, scrubber ensures mirror copies remain consistent. With ReFS, scrubber goes further by ensuring that if a corruption occurs to a subset of copies, it corrects the corruption from a good copy.
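Dirty region tracking can be sketched as follows. This is a toy in-memory model under stated assumptions: the class and method names are hypothetical, a real implementation would persist the dirty-region record so that it survives the power loss, and overlapping concurrent writes are not handled.

```python
class MirrorSpace:
    """Toy two-way mirror that records which regions have writes in flight."""

    def __init__(self, size, region_size, copies=2):
        self.region_size = region_size
        self.copies = [bytearray(size) for _ in range(copies)]
        self.dirty = set()  # stands in for a persistent dirty-region record

    def _regions(self, offset, length):
        first = offset // self.region_size
        last = (offset + length - 1) // self.region_size
        return range(first, last + 1)

    def begin_write(self, offset, length):
        # Mark the regions dirty before any copy is modified.
        self.dirty.update(self._regions(offset, length))

    def write_copy(self, copy_index, offset, data):
        self.copies[copy_index][offset:offset + len(data)] = data

    def complete_write(self, offset, length):
        # All copies are updated, so the regions are clean again.
        self.dirty.difference_update(self._regions(offset, length))

    def resync(self):
        """Copy only the dirty regions from copy 0 to the other copies."""
        resynced = sorted(self.dirty)
        for region in resynced:
            start = region * self.region_size
            end = start + self.region_size
            for other in self.copies[1:]:
                other[start:end] = self.copies[0][start:end]
        self.dirty.clear()
        return resynced


space = MirrorSpace(size=64, region_size=16)
space.begin_write(20, 4)
space.write_copy(0, 20, b"data")  # power is lost before copy 1 is written
resynced = space.resync()         # recovery touches one region, not the whole space
print(resynced, space.copies[0] == space.copies[1])
```

The point of the design is that recovery cost scales with the amount of in-flight data rather than with the size of the space.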

Parity Spaces:

Parity spaces have the following crash resiliency feature:

  • Journal: In the event of power loss and similar server fault events, the integrity of parity spaces is protected through journaling. Storage Spaces ensures that at-rest data is not affected by in-flight overwrites by always journaling new data and parity. Hence, parity spaces quickly recover from a power loss or similar server fault event by replaying the journal.
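The journaling idea can be illustrated with a small model: new data and new parity are recorded in a journal first, and only then applied in place, so an interrupted update can be completed by replaying the journal. This is a minimal sketch, assuming XOR parity and an in-memory list standing in for a persistent journal. All names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


class JournaledParityStripe:
    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = xor_blocks(self.data)
        self.journal = []  # stands in for a journal on persistent storage

    def write(self, index, new_block, crash_before_apply=False):
        new_data = list(self.data)
        new_data[index] = new_block
        record = (index, new_block, xor_blocks(new_data))
        self.journal.append(record)   # step 1: journal new data and parity
        if crash_before_apply:
            return                    # simulated power loss; at-rest data untouched
        self._apply(record)           # step 2: update data and parity in place
        self.journal.remove(record)   # step 3: retire the journal record

    def _apply(self, record):
        index, block, parity = record
        self.data[index] = block
        self.parity = parity

    def replay(self):
        """After a fault, apply every journaled update that was not retired."""
        while self.journal:
            self._apply(self.journal.pop(0))


stripe = JournaledParityStripe([b"AA", b"BB"])
stripe.write(0, b"ZZ", crash_before_apply=True)
before_replay = list(stripe.data)  # still the old, consistent contents
stripe.replay()
print(before_replay, stripe.data, stripe.parity == xor_blocks(stripe.data))
```

Data and parity are never left half-updated in this model, because the in-place update either has not started or can be finished from the journal.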

 

Flexible Resiliency Repair

Drive failures in a pool can degrade resiliency in contained storage spaces. Upon a drive failing and one or more spaces becoming degraded, Storage Spaces automatically starts repair of the affected spaces, provided there is sufficient spare storage capacity in the pool.

If the pool has a hot spare drive, Storage Spaces will automatically use it to provide capacity for repairing degraded storage spaces.

 

Rapid Failover Processing

In a clustered environment, for example a two-node cluster, if one node fails, Storage Spaces makes it possible for the administrator to quickly access the data in all spaces through the remaining cluster node. Upon failover to another cluster node, the space comes online and recovery actions happen in the background, enabling quick failover processing in a cluster.

Failover Clustering

 
Enterprises are increasingly consolidating many workloads onto few servers. The value of Failover Clustering is realized at the moment of a server failure. With Failover Clustering, a server failure results in workloads automatically moving to other servers in the cluster, where they can continue to run. Storage Spaces supports Failover Clustering and makes it possible for enterprises to deploy cost-efficient, industry-standard storage. Enterprises can consolidate multiple workloads onto the cluster, while being confident that their datasets will be available despite server failure.

For more information on Failover Clustering, see the following topic:

  • Failover Clustering

http://technet.microsoft.com/en-us/library/cc725923

 

Cluster Deployment Element


Storage Spaces in conjunction with Failover Clustering can deliver a resilient, highly available, and cost-efficient solution which can be scaled from simple deployments to the needs of a large datacenter. The basic building block of a storage space cluster deployment is a small collection of servers, typically 2-4, and a set of shared SAS JBOD enclosures – this building block is called a Deployment Element. The JBOD enclosures are connected to all of the servers in the deployment element with data access resiliency as noted in a prior section such that each server has redundant paths to all of the disks in each JBOD enclosure.

Regardless of the number of servers, the number of JBOD enclosures, or the number of provisioned virtual disks, Cluster Shared Volumes (CSV) on top of NTFS unifies storage access into a single namespace for ease of management, in which all servers in the cluster can concurrently access all of the cluster shared volumes. This unified namespace enables highly available workloads to be transparently failed over to another server in the event of a server failure, or simply moved off a server to allow for a maintenance operation.

By using simple building blocks such as the Deployment Element, administrators can easily build a cluster for their workloads, whether those run physically on the servers or are virtualized through Hyper-V. In either case, running workloads that support high availability can quickly fail over to another server in the event of a server failure or maintenance need.

Combined with clustering and CSV, Storage Spaces on commodity storage gives administrators a cost-effective, scalable, and manageable solution.

The below diagram illustrates a simple deployment with a single deployment element:

 

 

Scalability and Continuous Availability


Storage Spaces and Failover Clustering can scale along with business needs through the addition of multiple deployment elements of homogeneous or heterogeneous configurations, up to a total of 64 nodes, or 16 4-node deployment elements. Highly available data access continues to be unified through the CSV namespace, and workloads can be expanded across deployment elements.

 

Thin Provisioning


Consolidation of multiple workloads into a single server often results in multiple datasets sharing the same storage hardware. Storage Spaces delivers thin provisioning to enable enterprises to easily share storage capacity among multiple unrelated datasets, all without depending on perfect forecasts of their ultimate storage capacity requirements. Thin provisioning ultimately allows administrators to have straight-forward storage planning, focusing on the minimum number of physical drives to deploy and the near-term size of their datasets. Thin provisioning is compatible only with standalone system deployments.

Cost Saving

Thin provisioning affords administrators the opportunity to provision more capacity than initially available physical storage. With the steady decline in storage hardware prices, administrators can add on more physical disks to their pool in the future at reduced costs.  

Flexibility Within Pools

Storage Spaces allows administrators to create spaces which use fixed and thin provisioning in the same pool. As drive warranties expire, drives fail, and datasets grow, administrators can remove drives from a pool and add drives to it. Administrators can grow their storage spaces, regardless of whether they use thin or fixed provisioning, as their datasets grow and as they add drives to the pool.

Efficient Storage Capacity Consumption

In order to provide easy sharing of storage capacity, each thinly provisioned space only consumes the capacity it actually requires to host data. This means on-demand allocation of capacity. Much of the functionality of Storage Spaces naturally leads to highly efficient consumption of storage capacity as datasets grow.  However, even while datasets generally tend to grow, datasets can experience intermittent periods of shrinking and some can even permanently shrink. On a regular basis, Windows optimizes the layout of files within each thinly provisioned storage space to reduce the amount of pool capacity they consume.
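On-demand allocation can be modeled as a pool that hands out fixed-size slabs only when a space first writes to them, and takes them back when they are trimmed. This is a conceptual sketch: `ThinPool` and its methods are hypothetical names, and the slab counts are illustrative rather than taken from the product.

```python
class ThinPool:
    """Toy pool that allocates physical slabs to thin spaces on first write."""

    def __init__(self, physical_slabs):
        self.free_slabs = physical_slabs
        self.provisioned = {}  # space name -> provisioned size, in slabs
        self.allocated = {}    # space name -> set of slab indexes backed by disk

    def create_space(self, name, provisioned_slabs):
        # Provisioning reserves nothing; it only records the advertised size.
        self.provisioned[name] = provisioned_slabs
        self.allocated[name] = set()

    def write(self, name, slab_index):
        if not 0 <= slab_index < self.provisioned[name]:
            raise IndexError("write beyond the provisioned size of the space")
        if slab_index in self.allocated[name]:
            return  # already backed by physical capacity
        if self.free_slabs == 0:
            raise RuntimeError("pool capacity exhausted; add drives to the pool")
        self.free_slabs -= 1
        self.allocated[name].add(slab_index)

    def trim(self, name, slab_index):
        # Capacity that no longer holds data goes back to the pool.
        if slab_index in self.allocated[name]:
            self.allocated[name].remove(slab_index)
            self.free_slabs += 1


pool = ThinPool(physical_slabs=10)
pool.create_space("vm-data", 8)
pool.create_space("archive", 8)  # 16 slabs provisioned on 10 physical slabs
pool.write("vm-data", 0)
pool.write("vm-data", 1)
pool.write("archive", 5)
pool.trim("vm-data", 1)
print(pool.free_slabs, sorted(pool.allocated["vm-data"]))
```

The two spaces together advertise more capacity than the pool holds, yet they consume only what they have actually written, which is what lets unrelated datasets share one pool.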

Threshold Notification

With Storage Spaces, capacity exhaustion is an avoidable condition. Administrators are given capacity consumption notifications at a configurable threshold, which, by default, is 70%, before the entire pool capacity is consumed. This allows administrators to add drives in advance of the pool’s capacity becoming fully consumed.
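The threshold check itself amounts to a percentage comparison. The sketch below assumes the notification fires once consumption reaches the threshold (the source does not say whether the comparison is inclusive), and the function name is hypothetical.

```python
def capacity_alert(allocated_bytes, pool_bytes, threshold_percent=70):
    """Return True when pool consumption has reached the alert threshold.

    The 70% default comes from the text above; treating the threshold as
    inclusive is an assumption.
    """
    used_percent = 100 * allocated_bytes / pool_bytes
    return used_percent >= threshold_percent


TB = 10**12
print(capacity_alert(6 * TB, 10 * TB))                        # 60% used
print(capacity_alert(7 * TB, 10 * TB))                        # 70% used
print(capacity_alert(6 * TB, 10 * TB, threshold_percent=50))  # custom threshold
```

With a 10 TB pool and the default threshold, the notification arrives with about 3 TB still free, leaving time to add drives.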

 

Administrative Ease And Simplicity


Optimized And Flexible Capacity Utilization

Storage Spaces provides on-demand provisioning with automated capacity reclamation, using the trim and unmap commands. This includes capacity reclamation for deleted files (NTFS and ReFS), sub-file capacity reclamation (e.g. through Hyper-V) and scheduled optimization for slab reclamation. These features help ensure that space is not wasted and data is stored in an optimized and effective method.

Storage Spaces provides elastic capacity expansion. To increase capacity, all the admin has to do is to add more drives. The diagram below illustrates this.



 

Operational Simplicity


Storage Spaces can be easily configured and managed through a variety of mechanisms, including a control panel applet in Windows 8, Server Manager’s File and Storage Services in Windows Server 2012, Failover Cluster Manager for managing a highly available storage pool, and PowerShell, a command-line shell that is available in both Windows 8 and Windows Server 2012.

The pooling model used in Storage Spaces enables flexible and granular administrative control. Administration can be defined and delegated on a per-pool basis. This allows for different administrators to be assigned to different pools and different spaces. Storage Spaces fully integrates with Active Directory and the Windows security model, providing a familiar model and tools for administration.


Scalable Pools and Spaces


One of the central benefits of aggregating physical drives into storage pools is to reduce the number of objects administrators must manage. Storage Spaces is able to scale up to aggregate many physical drives and remain flexible enough for administrators who only desire a few pools. The same applies to the spaces that are created on the pools. Administrators can choose to have as many or as few spaces as they desire.

 

Effective Utilization of Hardware Investment


Storage Spaces enables effective utilization of hardware investment. Storage Spaces supports mixed deployments, so it is possible to utilize both SSD and HDD media types and control placement at storage space granularity.

 

Conclusion


Storage Spaces significantly reduces the cost of available, reliable, and scalable storage because it does not require a storage network and an external storage subsystem. It is designed to work with cost-efficient, industry-standard commodity hardware. It is easily scalable, highly resilient, highly available, and easy for administrators to manage. Storage Spaces integrates with other capabilities that are part of Windows Server 2012 to provide a cost-effective storage platform that works at scale and satisfies the performance needs of mission-critical deployments.
Comments
  • Under title Administrative Ease and Simplicity and subheading Optimized and Flexible Capacity Utilization there are two diagrams...appears the bottom diagram should be labeled 48 TB across 96 disks, not 36 TB across 72 drives which is the correct label for the top diagram BEFORE the new 24 drive enclosure is added below.  Admittedly this is a nit but threw me off as currently both labeled the same.

  • Does this provide the ability to assign per-VM service levels (e.g. Gold, Silver, Bronze)? If using both SSDs and magnetic disks, can you configure the SSDs for any type of read/write caching that can be set at the VM level? How about the ability to do "sub-space" data tiering, meaning moving hot spot chunks of data to SSD but not the entire VHDX file?

  • @vDerekS: If you want to assign a VM a certain level of storage you create a Storage Pool->Storage Space->Virtual Disk->Volume for each level of storage and then store the VM on the appropriate Volume.

    Write Back Cache is automatically enabled for any Storage Pool with SSD in it. You can turn it off but it can, as far as I know, not be configured in detail though.

    The auto-tiering works with 1 MB large chunks so unless you manually assign a file to either the SSD tier or the HDD tier only the active data rounded up to the nearest 1 MB chunk will be on SSD.
