In this article we discuss the physical hardware platform for hosting a Private Cloud Infrastructure as a Service capability within an organization. The physical hardware platform is the environment that the Private Cloud Management and Operations Layers will operate upon to orchestrate the provisioning and management of resource instances provided by the platform.

The physical platform includes server, storage, and network hardware from a wide range of vendors. This content series presents the hardware capabilities required to enable the cloud-like characteristics defined in industry models and the Private Cloud Reference Architecture.

The hosting platform also includes concerns related to the facility supporting the Infrastructure as a Service platform: the geographic location of primary and backup datacenters, and the environmental layout of the facility, including power, cooling, and airflow distribution. Laying out hardware within the facility to gain maximum density, utilization, and efficiency of available resources, in a manner consistent with so-called Green IT operational practices, is also a consideration.

The datacenter and facility concerns influence the physical platform because a private cloud requires a highly mature management orchestration capability to deploy, manage, and operate the resource instances provided by the platform. This orchestration extends down to the lowest level of physical hardware management, referred to as bare-metal provisioning of the Private Cloud Infrastructure as a Service platform. Bare-metal provisioning, coupled with the desire to operate datacenters at maximum efficiency, requires that the physical platform components expose management and monitoring interfaces that allow integration into the private cloud orchestration. The orchestration can then take environmental data into account when provisioning new resources or migrating existing ones.

This document is part of a collection of documents that comprise the Reference Architecture for Private Cloud document set. The Reference Architecture for Private Cloud documentation is a community collaboration project. Please feel free to edit this document to improve its quality. If you would like to be recognized for your work on improving this article, please include your name and any contact information you wish to share at the bottom of this page.

This article is no longer being updated by the Microsoft team that originally published it. It remains online for the community to update, if desired. Current documents from Microsoft that help you plan for cloud solutions with Microsoft products are found at the TechNet Library Solutions or Cloud and Datacenter Solutions pages.

1 Establishing a Scale Unit

The acquisition and build-out of resource capacity in a private cloud is done in chunks called Scale Units. The Private Cloud Reference Architecture defines a scale unit as a standardized unit of capacity that is added to a datacenter. There are two types of Scale Unit: a Compute Scale Unit, which includes servers and network, and a Storage Scale Unit, which includes storage components. Scale Units increase capacity in a predictable, consistent way, allow standardized designs, and enable capacity modeling.

Establishing a scale unit for a private cloud is based upon business, technical, and operational criteria that are specific to the organization designing the cloud. While there is no hard and fast definition of a scale unit that fits all scenarios, there are common characteristics that architects and implementers should consider in crafting their organization's definition of a scale unit. Additionally, a scale unit definition represents a point in time and will evolve as business needs change, as advances in hardware technology and density occur, and as workloads evolve over time.

1.1 Business Characteristics

The creation and use of cloud computing within IT organizations is driving the business to consider the costs of delivering IT as operational costs rather than capital expenditures. This is true to a lesser extent for private clouds: the initial costs to create a private cloud infrastructure are capital expenditures for the IT organization, but the chargeback model bills each business consuming IT resources as an operational expense associated with providing IT services.

IT organizations in the context of a private cloud are still governed by business buying cycles where capital hardware acquisitions occur throughout the fiscal year. These buying cycles can influence the amount of capacity anticipated to meet demand during the course of the cycle, and therefore the scale unit will anticipate the combined workload needs of the organization for the period of a hardware acquisition cycle.

These hardware acquisition cycles may be closely linked with vendor or supplier agreements that include provisions for volume pricing or other discounts that influence the selection of a scale unit.

1.2 Technical Characteristics

The technical characteristics associated with scale unit determination are driven by the needs of the workloads hosted on the platform. Providing Infrastructure as a Service is about the platform delivering compute, storage, and network resources in a manner consistent with business and IT service management requirements. Service management will include a catalog of resources that are available to the architect of a workload.

The compute resource catalog will include several options for compute resources that are classified by the capacity or capability provided by the resource. The capacity of compute resources includes the available memory, number of processor cores and the I/O bandwidth provided by the resource. The catalog will classify definitions for compute resources with descriptive names such as Extra-Small through Large. These classifications may be further defined in the organization as:
  • Extra-Small – These compute resources may be used for development sandbox environments for use by developers and testers performing application development for the private cloud.
  • Small – Small compute resources may be used throughout the deployment lifecycle from staging through production to host workload instances such as individual applications or instances that compose a farm of compute resources that make up a web front end layer.
  • Medium – Compute resources in this classification include business logic layer services or application workloads that place higher demands on the resource.
  • Large – Large compute resources are used for workload applications that have high in-memory footprints, large I/O requirements, or large computational requirements. Typically these host the back-end data layer of an n-tier application.
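A catalog like this is naturally expressed as a small data structure. The following sketch shows one way to model the classifications in Python; the capacity figures (cores, memory, I/O bandwidth) and the `ComputeClass`/`lookup` names are illustrative assumptions, not values from the reference architecture:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ComputeClass:
    """One entry in the compute resource catalog."""
    name: str
    cores: int
    memory_gb: int
    io_mbps: int  # aggregate I/O bandwidth allotted to an instance

# Illustrative capacities only; each organization defines its own values.
CATALOG = {
    "Extra-Small": ComputeClass("Extra-Small", 1, 1, 50),
    "Small":       ComputeClass("Small",       1, 2, 100),
    "Medium":      ComputeClass("Medium",      2, 4, 200),
    "Large":       ComputeClass("Large",       4, 8, 400),
}

def lookup(name: str) -> ComputeClass:
    """Resolve a catalog classification by its descriptive name."""
    return CATALOG[name]
```

Service management would surface these named classifications in the self-service catalog, while fabric management uses the underlying capacities for placement and capacity modeling.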

When orchestrating the provision or migration of these workloads the fabric may employ several methods to select an appropriate host to accommodate the resource instance. One such method may be intelligent placement based upon host utilization at the time the selection is made. Utilization measurements include processor utilization, memory utilization and I/O utilization or may be an overall aggregation of these values. Another method may be to pre-configure resource classifications by group and associate the group with host resources. This would cause all Extra-Small instances to be hosted by one group of hosts, all Small instances to be hosted by another and so on.
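The two placement methods above can be sketched together as follows. This is a hypothetical illustration, assuming each host reports normalized CPU, memory, and I/O utilization and is pre-assigned to a classification group:

```python
def aggregate_utilization(host):
    """Aggregate processor, memory, and I/O utilization into one score;
    here the busiest dimension dominates the placement decision."""
    return max(host["cpu"], host["mem"], host["io"])

def place(hosts, class_group):
    """Intelligent placement: restrict to the host group pre-configured
    for this resource classification, then pick the least-utilized host.
    Returns None when no host in the group is available."""
    candidates = [h for h in hosts if h["group"] == class_group]
    if not candidates:
        return None
    return min(candidates, key=aggregate_utilization)
```

A real fabric manager would also verify that the selected host has headroom for the instance's catalog capacity before committing the placement.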

Therefore, the selection of a scale unit based on technical characteristics will evaluate the anticipated workload demands on the infrastructure and the resource classification mix each workload requires of the infrastructure.

1.3 Operational Characteristics

The operational characteristics that influence the selection of a scale unit are driven by the technology innovations by hardware vendors to grow resource density while reducing the physical footprint. High unit densities in smaller areas require new and innovative techniques to package and integrate resources. Modern server configurations include a chassis supporting multiple compute blades with storage and networking connections integrated into the backplane.

Incorporating these high-density chassis into traditional datacenter racks, and deploying those racks within the datacenter, has driven new designs for how power and cooling are implemented, including the creation of cold and hot rows in the datacenter to provide efficient and effective airflow distribution and exhaust.

These physical characteristics and deployment methods may influence scale unit selection, since the build-out or reconfiguration of datacenter environmental infrastructure may be prohibitively expensive or impractical.

Another operational characteristic involves the platform vendor's approach to exposing holistic management interfaces for performing low-level operations on the hardware platform. These include:

  • Query Hardware Configuration
  • Query Hardware State
  • Update of Firmware
  • Remote Console Access
  • Remote Power Management
  • Environmental Conditions
  • Fault Conditions and Diagnostic Codes
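These operations can be thought of as a vendor-neutral management contract that fabric management programs against. The sketch below models them as an abstract Python interface; the class and method names are hypothetical, and a real implementation would wrap an out-of-band controller protocol such as IPMI or a vendor's management API:

```python
from abc import ABC, abstractmethod

class BaseboardManager(ABC):
    """Hypothetical vendor-neutral contract over a server's out-of-band
    management controller; methods mirror the operations listed above."""

    @abstractmethod
    def hardware_configuration(self) -> dict: ...   # query hardware configuration

    @abstractmethod
    def hardware_state(self) -> str: ...            # query hardware state

    @abstractmethod
    def update_firmware(self, image_path: str) -> None: ...

    @abstractmethod
    def console_url(self) -> str: ...               # remote console access

    @abstractmethod
    def power(self, action: str) -> None: ...       # "on", "off", "cycle"

    @abstractmethod
    def environment(self) -> dict: ...              # temperatures, fan speeds

    @abstractmethod
    def faults(self) -> list: ...                   # fault and diagnostic codes
```

Each vendor-specific implementation of this contract can then be plugged into the orchestration layer without changing the workflows that call it.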

Extending fabric management down to the lowest levels of physical platform provisioning enables a private cloud to respond elastically by dynamically bringing host resources online and offline as needed. This can only be accomplished if the host platform supports a rich administration capability that can be integrated into private cloud management and operations.

Therefore, the management capabilities and physical characteristics of the host platform influence the selection of the private cloud scale unit at a given point in time, and that selection is in turn influenced by the design and operational aspects of the datacenter.

The selection of scale unit configuration and size is left to the IT architect and implementer to decide based on a combination of business, technical, and operational drivers. There is no one size that fits all. A scale unit may be as small as one integrated blade chassis configured with redundancy to implement a physical fault domain, or as large as a container of racks and resources that implements multiple physical fault domains.

2 Server Hosts

The server hosts that make up the physical hardware of a Private Cloud Infrastructure as a Service platform have two primary functions. The first is to physically host workload applications that are incorporated into services hosted on the private cloud. While this is technically possible on modern hardware, workloads are more likely to be hosted in a virtualized environment in the private cloud. In practice, this scenario is reserved for workloads that have some constraint, restriction, or other limitation that prevents them from being virtualized.

The second, and most common, use for the physical servers is running the core operating system platform that hosts the virtual machine instances provisioned by the Management and Operations Layers of the private cloud. The primary goal of fabric management is to configure and provision virtual machine instances in response to a service management request; those virtual machine instances must be hosted somewhere in the private cloud infrastructure. That somewhere is the virtual machine hosts.

Ideally, in the most mature organizations, the IT processes for configuration, management, and operation of virtual machine hosts are encoded into automated scripts and workflows that can be integrated into the private cloud fabric management.

This is referred to as bare-metal provisioning of physical host resources. Consider the notion of capacity in the private cloud. Capacity is defined by the ability of fabric management to compose compute resources into pools classified by capability. These pools are allocated into resources that make up a service and assigned to an upgrade domain for service deployment. Capacity is limited by the number of resource instances that can be hosted by the physical hosts in the private cloud infrastructure.

To dynamically grow (or shrink) capacity to meet demand implies an ability to add or remove physical host resources in the fabric. Minimally, these resources can be configured and brought online manually and made available to the fabric. However, to reach the maturity level expected of a private cloud, physical host resource provisioning must be integrated into the Management and Operations Layer's fabric management.

This dynamic capability allows a private cloud to respond to a service request for more resources than the host capacity currently online can accommodate. To satisfy the request, fabric management initiates an automated workflow to perform bare-metal provisioning of a new host and configure the resource into fabric management, thereby increasing the overall capacity of the private cloud. The process of dynamically growing capacity is transparent to the user or service owner, so the capacity of the private cloud appears infinite.
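The workflow just described can be sketched as a capacity check that triggers bare-metal provisioning when the online hosts cannot absorb a request. The names here (`ensure_capacity`, `provision_bare_metal`) are illustrative assumptions, and a real fabric would track capacity per resource classification rather than by a flat instance count:

```python
def ensure_capacity(fabric, requested_instances, per_host_capacity):
    """Sketch of the elastic workflow: when the hosts currently online
    cannot absorb a request, bare-metal provision additional hosts into
    the fabric until enough free slots exist. Returns the host count."""
    free = sum(per_host_capacity - h["instances"] for h in fabric["hosts"])
    while free < requested_instances:
        # Hypothetical automated workflow that images, configures, and
        # registers a new physical host with fabric management.
        new_host = fabric["provision_bare_metal"]()
        fabric["hosts"].append(new_host)
        free += per_host_capacity - new_host["instances"]
    return len(fabric["hosts"])
```

Because the provisioning step is fully automated, the consumer of the service request never observes the capacity expansion taking place.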

We have already touched on some of the host server characteristics required to realize this level of IT maturity and service management. In the next section we'll expand on the host server architecture in more detail.

2.1 Host Server Architecture

The host server architecture is a critical component of the private cloud infrastructure, as well as a key variable in the consolidation ratio and cost analysis. The ability of the host server to handle the workload of a large number of consolidation candidates increases the consolidation ratio and helps provide the desired cost benefit. This section outlines the major host server architecture considerations and Section 7 provides three host server reference architectures.

Microsoft does not recommend specific server vendors or models in this section because there are many excellent candidates. Microsoft does recommend finding the best cost/performance ratio in the host server choice by focusing on mainstream Intel or AMD servers toward the high end of the commodity server scale. The “sweet spot” is typically in the two- to four-socket servers running the highest or second highest CPU speed with multi-core processors.

2.2 System Architecture

The system architecture of the host server refers to the general category of the server hardware itself. Examples include rack-mounted servers, blade servers, and large symmetric multiprocessor (SMP) servers. The primary tenet to consider when selecting a system architecture is that each virtualization host will contain multiple guests with multiple workloads. Processor, RAM, storage, and network capacity are critical, as are high I/O throughput and low latency. It is critical to ensure that the host server can provide the required capacity in each of these categories.

2.2.1 Standard Rack Mount Servers

The most common system architecture is a standard rack-mounted server. Typically found in 2U or 4U models, these servers usually contain 2 to 4 CPU sockets, 2 to 8 PCI-E or PCI-X slots, and 4 to 6 hard disk bays. Rack-mounted servers are excellent choices for Hyper-V hosts due to the low cost of commodity 2- and 4-socket servers and their inherent scalability and expandability through additional network adapter and HBA slots.

2.2.2 Blade Chassis and Servers

Blade servers have dramatically increased in popularity and capability due to the ever increasing need for capacity and server density. Blade server architectures are a primary area of R&D for the server manufacturers, resulting in a significant amount of innovation in this space. The downside of blade servers is limited standards and interoperability between manufacturers and, in some cases, within the same manufacturer when they change their blade chassis architecture.

The processor and space density benefits of blade servers initially came at the expense of expandability; the first several generations of blade servers limited the quantity of network adapters and HBAs that could be supported.

Recently, the advent of blade architectures in which each blade contains 8 to 16 cores, up to 64 GB of RAM, and, most importantly, 6 or more I/O interfaces has eliminated many of the disadvantages that previously weighed against using blade server architectures for virtualization.

The network and storage I/O required to support the desired number of guests on each host server must be considered carefully, to ensure that both the blade each host server runs on and the blade chassis itself can support it.

The host server architecture must be considered when evaluating blade server system architectures. For the Two-Node and Server Farm patterns, dedicated network adapters are required for the cluster heartbeat network. If an iSCSI storage system is planned, two additional dedicated network adapters are required for access to the storage and redundancy. Finally, at least two network adapters should be dedicated to network I/O. The number of network adapters required per host can easily expand from 4 to 6 or more network adapters. This is frequently beyond the number supported by many blade servers.
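The adapter arithmetic in the preceding paragraph can be captured in a small helper. This is a back-of-the-envelope sketch under the assumptions stated above: one dedicated management adapter, one cluster heartbeat adapter, at least two adapters for guest network I/O, and two redundant iSCSI paths when iSCSI storage is planned:

```python
def required_nics(iscsi=False, guest_nics=2, live_migration=False):
    """Rough count of physical network adapters for one clustered host,
    following the dedicated-adapter guidance described above."""
    count = 1               # dedicated host management adapter
    count += 1              # private cluster heartbeat network
    count += guest_nics     # guest virtual machine network I/O
    if live_migration:
        count += 1          # dedicated Live Migration network
    if iscsi:
        count += 2          # redundant iSCSI paths (MPIO)
    return count
```

With iSCSI enabled the count reaches six, which, as noted, exceeds what many blade servers can accommodate.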

2.2.3 Large SMP Servers

For the purposes of this document, large SMP servers are defined as those that have 8 or more CPU sockets. At the very high end, Windows Server 2008 R2 Datacenter Edition on 64-bit hardware can support servers with up to 64 CPU sockets and 2 TB of RAM. Many of these very high end servers include advanced features such as hardware partitioning, hot-add of resources, hot spare components, and so on. Hardware of this capacity has the potential to host hundreds of virtual machine guests.

A large SMP server or large SMP server cluster may be appropriate if a very large number of servers will be consolidated, if the organization has operational experience with large “mainframe” class servers, or if they have already standardized on large SMP hardware. 

2.3 Processor Architecture

Windows Server 2008 R2 with Hyper-V requires an x64 processor architecture from Intel or AMD, as well as support for hardware execute disable and hardware virtualization such as Intel Virtualization Technology or AMD Virtualization.

Both Intel and AMD provide a wide range of processors that are appropriate for host servers. The industry competition between the two is very tight, and at any one time one may have a performance advantage over the other. Regardless of which manufacturer is chosen, several performance characteristics are important.

The number of processor cores is a key performance characteristic. Windows Server 2008 R2 with Hyper-V makes excellent use of multi-core processors, so the more cores the better. Another important characteristic is the processor clock speed, which is the speed at which all cores in the processor operate. It is important because it is also the clock speed of all of the guest virtual machines. This is a key variable in the consolidation ratio because it affects both the number of candidates that the host server can handle and the speed at which those guests will operate. For example, choosing a 2 GHz processor rather than a 3 GHz processor on a server that will host 20 guests means that all of those guests will run at only 2 GHz.
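As a rough illustration of how core count and clock speed feed the consolidation ratio, the following back-of-the-envelope calculator estimates guest count from CPU capacity alone. The per-guest demand and overcommit figures are assumptions for illustration only; real sizing must also account for RAM and I/O:

```python
def consolidation_estimate(host_cores, core_ghz,
                           guest_vcpus=1, guest_ghz_demand=0.5,
                           overcommit=2.0):
    """Back-of-the-envelope guest count from CPU capacity alone;
    RAM and I/O usually constrain the real number further."""
    host_capacity_ghz = host_cores * core_ghz * overcommit
    per_guest_ghz = guest_vcpus * guest_ghz_demand
    return int(host_capacity_ghz // per_guest_ghz)
```

For instance, a 16-core host at 2 GHz with 2:1 overcommit yields far more CPU headroom than the same host at a lower overcommit ratio, which is why both clock speed and placement policy matter to the final ratio.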

At a lower level of detail, the server processor architectures make design choices in terms of the type and quantity of processor cache, memory controller architecture, and bus/transport architecture. A detailed analysis of these factors is beyond the scope of this document.

2.4 Memory Architecture

Once the system architecture and processor architecture choices are made, there are relatively few options remaining for the memory architecture, because it is usually predetermined by the manufacturer/system/processor combination. The remaining memory architecture choices are typically quantity, speed, and latency. For supporting virtualized workloads, the most important memory architecture choice is the quantity of RAM.

The quantity of RAM is a more important factor than RAM speed or latency. Once the maximum amount of RAM that is cost effective is determined, if there is a remaining choice between speed and latency, choosing the memory with lower latency is recommended.

2.5 Network Architecture

The network architecture of the host server is a frequently overlooked topic in host server sizing because gigabit Ethernet network adapters are now very inexpensive and most servers have at least two built in. The topic is important, however, because it is directly affected by the host server architecture pattern selected. If one of the two host server cluster patterns is selected, a dedicated network adapter per server is required for the cluster private (heartbeat) network. As mentioned previously, if an iSCSI storage architecture is being utilized, network adapters will need to be dedicated to storage I/O traffic. Gigabit Ethernet is a high-speed network transport, though a host server with a large number of guests may require more than gigabit speed, thus requiring additional network adapters. Finally, it is recommended that each host server have a network adapter dedicated to the host itself for network I/O and management.

As described earlier, a fairly large number of network adapters per host server may be required. This is one factor that can weigh against blade servers in some instances. Recently, 10-gigabit Ethernet has become commonly available and its price is starting to drop, similar to the way gigabit Ethernet pricing fell over the years. The ability of servers to utilize 10-gigabit Ethernet network adapters is a significant factor in increasing the consolidation ratio.

Use multiple network adapters and multi-port network adapters on each host server:

  • One network adapter dedicated to the host machine only for management purposes.
  • One network adapter dedicated to the private Cluster Heartbeat network.
  • One network adapter dedicated to the Live Migration network.
  • One or more network adapters dedicated to the guest virtual machines (use 10 Gbps network adapters for highest consolidation).
  • Two or more network adapters dedicated to iSCSI with Multipath I/O (MPIO).

Dedicate at least one network adapter/port on each host server to guest virtual machine network I/O. For the maximum consolidation ratio, utilize one or more 10-gigabit Ethernet network adapters for virtual machine network I/O.

2.6 Host Bus Adapter Architecture

The disk storage for all guest virtual machines consists of one or more VHD files housed on the storage system utilized by the host server. Host storage I/O, in addition to the system, processor, memory, and network architectures described earlier, is the final major component of host server sizing. Hyper-V I/O consists of a large number of read and write IOPS against the storage system, due to the large number of guests running on each server and their varied workloads.
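A simple way to reason about this aggregate demand is to multiply a per-guest IOPS figure by the guest count and split the total into reads and writes. The figures below are hypothetical assumptions for illustration; actual per-guest IOPS and read/write mix vary widely by workload:

```python
def host_iops_estimate(guest_count, iops_per_guest=50, read_fraction=0.6):
    """Aggregate IOPS demand a host places on its storage system,
    split into reads and writes. Per-guest figures vary by workload."""
    total = guest_count * iops_per_guest
    reads = round(total * read_fraction)
    return {"total": total, "read": reads, "write": total - reads}
```

An estimate like this helps validate that the chosen storage controller or SAN fabric can sustain the consolidated load before guests are placed on the host.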

If direct attached storage is being utilized, a SATA II or SAS RAID controller internal to the server is recommended as described earlier. If a storage array and SAN are being utilized, host bus adapters (HBAs) are required in the host server. The HBA provides access to the storage array from the host server. The storage connectivity is a critical component for high availability and performance.