With increasing numbers of organizations looking to create cloud-based environments or to implement cloud technologies within their existing data centers, business and technology decision-makers are looking closely at the possibilities and practicalities that these changes involve. Evidence of this growth of interest in the cloud is shown by organizations such as Gartner, who in their 2011 poll of Chief Information Officers identified cloud computing as the top technology priority.

Although the increase in business agility coupled with greater flexibility of service provision are convincing arguments in favor of moving to the private and hybrid cloud models, significant deployment blockers remain. In consequence, progress towards implementing private cloud technologies will not advance without organizations and IT departments showing how they can address these concerns effectively.

This article is a blueprint for the design of a Private Cloud Infrastructure as a Service (IaaS) layer within an enterprise IT organization. The layer provides cloud-like capabilities that enable the deployment, operation and management of a private cloud infrastructure. This infrastructure is grounded upon the Reference Architecture for Private Cloud, which together with the Infrastructure as a Service design represents the overall reference architecture for a private cloud that supports higher layers of cloud services and workloads deployed and operated in Platform as a Service (PaaS) and Software as a Service (SaaS) service models. Thus the Infrastructure as a Service layer is the foundation for the design of IT service capabilities that are expected to exhibit cloud-like characteristics.

This document is part of a collection of documents that comprise the Reference Architecture for Private Cloud document set. The Reference Architecture for Private Cloud documentation is a community collaboration project. Please feel free to edit this document to improve its quality. If you would like to be recognized for your work on improving this article, please include your name and any contact information you wish to share at the bottom of this page.

This article is no longer being updated by the Microsoft team that originally published it.  It remains online for the community to update, if desired.  Current documents from Microsoft that help you plan for cloud solutions with Microsoft products are found at the TechNet Library Solutions or Cloud and Datacenter Solutions pages.

The blueprint serves as the starting point of a series through which IT Architects gain the knowledge to apply the Private Cloud Reference Model, the associated Private Cloud Principles, Patterns, and Concepts, and the planning guides to design a solid infrastructure foundation for an organization's need to provide cloud-like capabilities in a private cloud deployment, whether on-premises, hosted, or combined in a hybrid cloud scenario.

Infrastructure as a Service Content

The Infrastructure as a Service content series consists of several articles that define what Infrastructure as a Service is, the scenarios that are enabled or addressed and the design characteristics for the platform fabric and management layers of the reference architecture. These articles are presented in order of preferred reading and so audiences that are seeing this series for the first time are encouraged to review the content in this order:
  1. Blueprint for Private Cloud Infrastructure as a Service (this article) 
  2. What is Infrastructure as a Service 
  3. Private Cloud Infrastructure as a Service Automation and Orchestration 
  4. Private Cloud Infrastructure as a Service Fabric Management 
  5. Private Cloud Infrastructure as a Service Monitoring 
  6. Private Cloud Infrastructure as a Service Self Service 
All of the articles listed above reference the Private Cloud Reference Model and Architecture, and therefore readers should be familiar with the details of the reference architecture before proceeding with the design of the Infrastructure as a Service layer. When applying the reference architecture there are many instances where multiple principles, patterns, and concepts collectively form the design basis for a desired outcome or result. In these cases, and throughout the Infrastructure as a Service theme, these articles hyperlink back to the appropriate reference architecture artifacts and forward to the appropriate planning guides for cross-reference and easier navigation.

Reference Architecture on the TechNet Wiki and Forum

The Private Cloud Reference Architecture and the Infrastructure as a Service theme have been authored and hosted on the Microsoft TechNet Wiki to allow for continuous feedback and improvement by Microsoft, customers and the community. While parts of the reference architecture are grounded in practices that have proven themselves over time and are not expected to change dramatically, other areas are expected to be enhanced as the practices for establishing cloud-like capabilities emerge and mature. The TechNet Wiki provides an excellent platform to allow the reference architecture and IaaS theme to continue to evolve with complete transparency to customers and the community.

Further, the TechNet Private Cloud Architecture Forums provide a platform for community discussion and the exchange of new ideas that advance the state of the reference architecture as new capabilities and experiences emerge in the industry. Throughout the reference architecture, hyperlinks are provided into the TechNet Forums to facilitate this discussion. When agreement and proven patterns emerge, Microsoft is committed to advancing those patterns into the reference architecture.

The Infrastructure Layer

The Infrastructure Layer is a component of the Private Cloud Reference Model and is illustrated in this figure:

Figure 1: Infrastructure Layer

The Infrastructure Layer components include:
  • Network: Network services provide addressing and packet delivery for the provider’s physical infrastructure and the consumer’s VMs. Network capability includes physical and virtual network switches, routers, firewalls, and Virtual Local Area Networks (VLANs).
  • Compute: Compute services supply the physical resources such as CPU, Random Access Memory (RAM), NIC, Video, and Storage used by the provider to deliver VMs to consumers. It contains the physical server hardware and parent OS.
  • Storage: Storage provides physical storage devices to the provider, which exposes these services to consumers as virtual disks. Storage should be connected to a network for VM portability.
  • Hypervisor: The hypervisor provides VM services by partitioning and presenting compute, network, and storage services.
  • Facilities: Facilities represent the physical building, racks, power, cooling, and physical interconnects.

These components represent the services provided by the Infrastructure Layer. They must exhibit cloud-like characteristics in a service-oriented manner consistent with the guiding principles of the Private Cloud Reference Architecture. Collectively these services form the fabric of the Private Cloud Infrastructure, which must be deployed, operated and managed in a holistic manner to provide the cloud-like characteristics expected from each component of the infrastructure.

What are these expected cloud-like characteristics?

While there are many definitions for cloud computing, one of the more concise and widely recognized definitions is provided by the U.S. National Institute of Standards and Technology (NIST). The NIST definition defines the essential characteristics, service models and deployment models that must be present in any cloud computing platform. However, to transparently provide these essential characteristics in a service-oriented manner, we must also include an extensive end-to-end automation and orchestration capability within the Infrastructure Layer to deploy, operate and manage the services provided in a responsive, secure and repeatable manner.

What is Infrastructure as a Service?

The article What is Infrastructure as a Service defines IaaS by coupling the NIST definition with a set of Private Cloud design Principles, Patterns and Concepts that together provide the grounding for a solid fabric design delivering Private Cloud Infrastructure as a Service. The architecture introduces design patterns such as Scale Units, Physical Fault Domains and Upgrade Domains to compose the compute, storage and network Resource Pools, following a Holistic Approach to Availability Design, so that each pool provides its respective services within the Infrastructure Layer.
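The interplay of fault domains and upgrade domains can be illustrated with a minimal Python sketch. It assumes a fault domain is a label shared by hosts that fail together (for example a rack) and an upgrade domain is a group of hosts serviced together. The Host class, the place_instances function and the rack and host names are hypothetical, chosen only for illustration, and are not part of the reference architecture.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    fault_domain: str    # hypothetical label: hosts sharing power and switching
    upgrade_domain: int  # hypothetical group of hosts that are serviced together


def place_instances(hosts, instance_count):
    """Spread instances across fault domains in round-robin order.

    Returns a list of (instance_index, Host) pairs. Host capacity is
    deliberately ignored to keep the sketch short.
    """
    by_domain = defaultdict(list)
    for host in hosts:
        by_domain[host.fault_domain].append(host)
    if not by_domain:
        raise ValueError("no hosts available")

    domains = sorted(by_domain)
    next_host = {domain: 0 for domain in domains}
    placement = []
    for i in range(instance_count):
        domain = domains[i % len(domains)]
        candidates = by_domain[domain]
        host = candidates[next_host[domain] % len(candidates)]
        next_host[domain] += 1
        placement.append((i, host))
    return placement


def survivors_after_fault(placement, failed_domain):
    """Instances still running if every host in one fault domain fails."""
    return [i for i, host in placement if host.fault_domain != failed_domain]


def survivors_after_upgrade(placement, upgrade_domain):
    """Instances still running while one upgrade domain is being serviced."""
    return [i for i, host in placement if host.upgrade_domain != upgrade_domain]


hosts = [
    Host("host-a1", "rack-a", 0),
    Host("host-a2", "rack-a", 1),
    Host("host-b1", "rack-b", 0),
    Host("host-b2", "rack-b", 1),
]
placement = place_instances(hosts, 4)
print(survivors_after_fault(placement, "rack-a"))  # instances outside rack-a
print(survivors_after_upgrade(placement, 0))       # instances outside upgrade domain 0
```

With four instances over two racks, losing either rack or servicing either upgrade domain leaves half of the instances running, which is the availability property these patterns aim for.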

Infrastructure Scenarios

Once the basic foundation for the infrastructure capabilities provided by the IaaS design is defined, we can examine the scenarios that are addressed or become enabled in an organization deploying a private cloud. These scenarios define the problem domain for Private Cloud Infrastructure as a Service. The scenarios covered in the IaaS theme have been categorized into three areas:
  • Business Drivers: These are scenarios that arise through the normal execution of business needs placing demands on IT. These generally involve the need to be more agile and responsive to business needs and changes in demand.
  • Operational Drivers: Operational scenarios emerge from the need to provide secure and repeatable processes and procedures in the deployment, operation and management of the infrastructure providing services to the enterprise business.
  • Technical Drivers: Technical scenarios are those that emerge from needs or constraints placed on the infrastructure to conform to platform or application development requirements.

There may be scenarios that span multiple categories, such as regulatory and compliance requirements that place demands on each of the areas. However, when completing an Infrastructure as a Service design, the architect may refactor those scenarios into unique scenario drivers appropriate to each category.

Automation and Orchestration

The fabric of a private cloud consists of physical resources that include servers, storage and networking that collectively compose the Infrastructure Layer. These resources along with supporting facilities such as power infrastructure, cooling plants, and building environmental controls also must be considered in the overall fabric management design.

Taking a closer look at each of these resources we find that server resources fall into several roles. These are:

  • The servers that host traditional IT service capabilities in non-virtualized environments. While technically possible to implement a private cloud IaaS layer in a non-virtualized environment it is much more likely a private cloud would be designed with a high degree of virtualization. In a hybrid cloud scenario it is likely during integration that an existing service is identified as a required component of the infrastructure and therefore must be included in the fabric design.
  • The servers that host virtual machine instances throughout the lifecycle that includes development, test, staging and production.
  • The virtual machine server instances that form Pooled Compute Resources in the infrastructure.
  • The servers that host the management infrastructure to deploy, operate and manage the virtual machine hosts. It is increasingly common to design for virtualizing the management capability within the infrastructure.

Similarly the storage and networking resources are provisioned to support the servers in each of the roles.

In traditional IT, servers may have been racked and cabled into an enterprise storage and network infrastructure. The software components would be installed either manually or published through software deployment technologies and tools. Eventually a large part of the overall deployment might be scripted, providing a degree of repeatability and automation. As the industry and enterprise progressed into highly Virtualized Infrastructure environments, the need to create the host infrastructure still existed, and many scripted processes moved forward and were integrated into virtual machine image authoring. As hardware platforms progressed from standalone servers to blade configurations with integrated communications hardware, greater automation of the environment could be realized.

The Private Cloud Reference Architecture defines several Principles, Patterns, and Concepts that provide A Perception of Infinite Capacity and Continuous Service Availability while Optimizing Resource Usage with Minimal Human Involvement. This can only be achieved in a highly virtualized environment with end-to-end service management automation and orchestration.
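To make the distinction between scripted steps and orchestration more concrete, the following Python sketch runs a sequence of provisioning steps in order and undoes the completed ones if a later step fails. It is an illustration under stated assumptions rather than the orchestration approach of any particular product: run_workflow, make_step and the step names are hypothetical, and the actions only write to a log.

```python
def run_workflow(steps, context):
    """Run (name, action, undo) steps in order.

    If any action raises, the undo function of every completed step is
    called in reverse order and False is returned.
    """
    completed = []
    try:
        for name, action, undo in steps:
            action(context)
            completed.append((name, undo))
    except Exception as exc:
        for _name, undo in reversed(completed):
            undo(context)
        context["log"].append(f"rolled back after: {exc}")
        return False
    return True


def make_step(name, fail=False):
    """Build a hypothetical step whose action and undo only write to the log."""
    def action(ctx):
        if fail:
            raise RuntimeError(f"{name} failed")
        ctx["log"].append(f"do {name}")

    def undo(ctx):
        ctx["log"].append(f"undo {name}")

    return (name, action, undo)


context = {"log": []}
succeeded = run_workflow(
    [
        make_step("allocate-storage"),
        make_step("create-vm"),
        make_step("join-network", fail=True),  # simulated failure
    ],
    context,
)
for line in context["log"]:
    print(line)
```

Encoding the undo path next to each action is one way to keep a failed deployment from leaving partially provisioned resources behind without requiring an operator to clean up by hand.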

Automation and Orchestration is covered in the article Private Cloud Infrastructure as a Service Automation and Orchestration.

Fabric Management

In the previous section we discussed the need for Automation and Orchestration to enable the desired characteristics of a private cloud. Before we perform automation we first must understand a wide range of concepts including what we are automating, how communication is performed between elements of the infrastructure, the events posted by the infrastructure, where events are posted and the available management interfaces to act on the infrastructure.

The Private Cloud Reference Architecture defines some key principles to be considered in designing fabric management. These are:

Fabric Management involves encoding the desired aspects into a service template. These service templates are used to automate the overall deployment, operation and monitoring of the services that are hosted in a private cloud. This fabric management typically also includes the monitoring of the management layer and facilities.
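As a rough illustration of what encoding a service into a template can mean, the Python sketch below models a template as plain data and expands it into one provisioning step per virtual machine. The ServiceTemplate and TierTemplate classes, their fields and the metric names are assumptions made for this example and do not reflect the schema of any specific fabric management tool.

```python
from dataclasses import dataclass, field


@dataclass
class TierTemplate:
    name: str
    instance_count: int
    cpu_cores: int
    memory_gb: int


@dataclass
class ServiceTemplate:
    name: str
    tiers: list = field(default_factory=list)
    # Hypothetical metric names that monitoring should collect per VM.
    monitored_metrics: tuple = ("cpu_percent", "memory_percent")

    def deployment_plan(self):
        """Expand the template into one provisioning step per VM instance."""
        steps = []
        for tier in self.tiers:
            for n in range(1, tier.instance_count + 1):
                steps.append({
                    "vm_name": f"{self.name}-{tier.name}-{n:02d}",
                    "cpu_cores": tier.cpu_cores,
                    "memory_gb": tier.memory_gb,
                    "monitor": list(self.monitored_metrics),
                })
        return steps


template = ServiceTemplate(
    "orders",
    [TierTemplate("web", 2, 2, 4), TierTemplate("db", 1, 4, 16)],
)
plan = template.deployment_plan()
for step in plan:
    print(step["vm_name"], step["cpu_cores"], step["memory_gb"])
```

Because the template is data rather than a script, the same definition can drive deployment, later scaling of a tier, and the monitoring configuration for every instance it produces.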

Fabric Management is covered in the article Private Cloud Infrastructure as a Service Fabric Management.

Self Service

Self Service capability is a characteristic of private cloud computing and must be present in any implementation. The intent is to give users access to a self-service capability that presents the options available for provisioning in an organization. The capability may be basic, offering only the provisioning of a virtual machine with a pre-defined configuration, or more advanced, allowing configuration options on top of the base configuration and leading up to a platform capability or service.

Self service capability is a critical business driver that enables members of an organization to become more agile in responding to business needs with IT capabilities to meet those needs in a manner that aligns and conforms with internal business IT requirements and governance.

This means the interface between IT and the business is abstracted to a simple, well-defined and approved set of service options that are presented as a menu in a portal or made available from the command line. The business selects these services from the catalog, begins the provisioning process and is notified upon completion. The business is then charged only for what it actually uses.

This is analogous to the capability available on a Public Cloud platform. See Private Cloud Infrastructure as a Service Self Service.
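The flow described here (an approved catalog, a provisioning request, a completion notice and usage-based charging) can be sketched in a few lines of Python. Everything in the example is hypothetical: the SelfServicePortal class, the catalog item names and the rates are placeholders, and the provisioning step only records the request instead of calling into a real fabric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogItem:
    name: str
    cpu_cores: int
    memory_gb: int
    rate_cents_per_hour: int  # hypothetical chargeback rate


class SelfServicePortal:
    """Toy portal: approved catalog, provisioning, notification, metering."""

    def __init__(self, catalog, notify):
        self._catalog = {item.name: item for item in catalog}
        self._notify = notify   # callable(requester, message)
        self._requests = {}     # request id -> (requester, CatalogItem)
        self._hours = {}        # request id -> metered running hours
        self._next_id = 1

    def menu(self):
        """Names of the service options approved by IT."""
        return sorted(self._catalog)

    def provision(self, requester, item_name):
        if item_name not in self._catalog:
            raise ValueError(f"{item_name!r} is not an approved catalog item")
        request_id = self._next_id
        self._next_id += 1
        self._requests[request_id] = (requester, self._catalog[item_name])
        self._hours[request_id] = 0
        # A real portal would hand off to the fabric here; this only records it.
        self._notify(requester, f"request {request_id}: {item_name} is ready")
        return request_id

    def record_usage(self, request_id, hours):
        self._hours[request_id] += hours

    def charge_cents(self, requester):
        """Charge only for metered hours, not for reserved capacity."""
        return sum(
            item.rate_cents_per_hour * self._hours[rid]
            for rid, (owner, item) in self._requests.items()
            if owner == requester
        )


messages = []
portal = SelfServicePortal(
    [CatalogItem("small-vm", 1, 2, 5), CatalogItem("large-vm", 4, 16, 40)],
    notify=lambda who, text: messages.append((who, text)),
)
rid = portal.provision("finance", "small-vm")
portal.record_usage(rid, 10)
print(portal.menu(), portal.charge_cents("finance"), messages)
```

Requests for anything outside the catalog are rejected, which is how the sketch reflects the point that the business chooses only from options IT has already approved.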

Private Cloud Infrastructure as a Service Problem Domain

Infrastructure as a Service, as the name implies, refers to providing IT computing infrastructure capabilities in a service-oriented manner. Therefore, in the context of Infrastructure as a Service, the service is the set of capabilities provided by the IT organization that are governed by proven practices and monitoring throughout the lifecycle of the service. There are two perspectives on Infrastructure as a Service: the consumer perspective and the service provider perspective. Each perspective has its own perception and expectation of how the service should perform and be delivered. These expectations are clearly defined in a mutual exchange between the provider and consumer and take the form of a service contract. Meeting the requirements of the service contract from both the consumer's and the provider's perspectives is referred to as Service Management.

Service Management seeks to balance the expectations of the consumer of IT services with the expectations of the IT organization to deliver those services in a repeatable, secure and reliable manner consistent with industry best practices and business compliance requirements.

The Service Management processes and tooling guide us in defining the problem domain for Private Cloud Infrastructure as a Service. The Service Management processes are illustrated in the Private Cloud Reference Model along two layers: the Service Operations and Service Delivery Layers. Both of these layers represent business or operational processes that define requirements that the Management Layer must implement via staffing or systematically. In a private cloud the goal must be to achieve a very high degree of automation in the management operations associated with delivering a service. This Service Management Automation becomes the essence of the Infrastructure as a Service Problem Domain Definition.

Business Drivers

Self Service

Business owners need the ability to respond quickly to changes in business need, and they find themselves looking for alternatives when the resources or time frames offered by IT are inadequate to meet that need. Self Service is a common capability of a private cloud and permits business owners to consume IT resources that have been designed by IT to comply with business or regulatory requirements. Self Service enables business agility and control while maintaining IT compliance and monitoring.

Operational Drivers

Low Resource Utilization

This scenario can arise for one or more reasons, including a traditional IT model in which resources are physically deployed in a vertical or siloed manner to meet the needs of each individual application or workload. It can also result from a lack of, or underuse of, management and monitoring tools to properly assess the utilization of enterprise resources, and may be linked to the organization's capability in the overall IT maturity model.

Cloud computing, and specifically Infrastructure as a Service designs that utilize the Private Cloud Reference Model, are designed using several principles. One is Taking a Service Provider's Approach, which positions the business as customers of IT and IT as a service provider to the business. This relationship requires the IT organization to have a very high level of maturity. This maturity leads to a discipline of understanding the business and its trends over time so that an accurate assessment of need occurs in time to meet it. This, combined with the principle of Optimization of Resource Usage, leads to cost reduction and optimum utilization of resources through sharing of appropriate resources across the enterprise. Such sharing is commonly more accepted in private cloud deployments since all corporate data remains in-house. Hybrid private clouds also benefit since the data that cannot be hosted externally also remains in-house.

The principle of Incentivize Desired Behavior also encourages consumers of IT resources to use only what they actually need.

And finally, the fabric management requirements of a private cloud require that IT processes be highly automated and monitored, giving the fabric and IT visibility into unutilized resources and an automated ability to re-provision workloads accordingly.
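A minimal Python sketch of how monitoring data might expose underused resources follows. It assumes utilization arrives as a list of CPU percentage samples per host and that a fixed average threshold is an acceptable first test. The find_underutilized function, the 20 percent default and the host names are illustrative assumptions; a real assessment would typically also weigh memory, storage, network and peak load.

```python
from statistics import mean


def find_underutilized(samples, threshold_percent=20.0):
    """Return {host: average CPU percent} for hosts below the threshold.

    samples maps a host name to a list of CPU utilization percentages.
    Hosts without samples are skipped rather than treated as idle.
    """
    flagged = {}
    for host, values in samples.items():
        if not values:
            continue
        average = mean(values)
        if average < threshold_percent:
            flagged[host] = average
    return flagged


samples = {
    "host-01": [5, 10, 15],   # lightly loaded
    "host-02": [60, 70, 80],  # busy
    "host-03": [],            # no monitoring data yet
}
candidates = find_underutilized(samples)
print(candidates)
```

Hosts flagged this way are candidates for consolidation or re-provisioning, while the host without data is a reminder that missing monitoring is itself one of the causes of low utilization going unnoticed.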

Automate Common and Repeatable Tasks

This is a continual ask and evolution of IT. The IT organization wishes to continually improve the ability to deliver IT in a consistent, secure and repeatable manner. The principle of Achieve Business Value through Measured Continual Improvement is core to every private cloud. Private clouds today are likely to be deployed on current-generation technology and to use the latest fabric management tools to manage the infrastructure. These tools provide powerful design and validation capabilities that enable the user to improve process and automation workflows as business and technology needs evolve.

Improve Staff to Resource Ratio

This is also a common ask of IT: essentially, to do more with less staff. This is sometimes seen as a threat to IT; however, it should be seen as an opportunity for both IT and the business. A private cloud Infrastructure as a Service deployment is highly automated, therefore allowing IT staff to focus on process maturity improvement and the authoring of new or additional capabilities on the private cloud. Those IT professionals who acquire the instrumentation, scripting and automation skills necessary in a private cloud environment benefit the overall maturity of the enterprise and its value to the business.

Path to PaaS and SaaS

Infrastructure as a Service is the foundation for supporting Platform as a Service and Software as a Service workloads deployed on a private, hosted or hybrid cloud. The same Management Fabric and associated processes and discipline apply to workloads deployed into higher layers of the private cloud.

Self Service

IT Operations sees self-service as a means to enable a well-defined set of IT capabilities to business owners while maintaining the business or regulatory governance required to minimize risk to IT and the business. This also frees IT staff to focus on other aspects of the business in addition to new opportunities.