Introduction

"Master data is information that is key to the operation of a business. It is the primary focus of the Information Technology (IT) discipline of Master Data Management (MDM), and can include reference data. This key business information may include data about customers, products, employees, materials, suppliers, and the like. While it is often non-transactional in nature, it is not limited to non-transactional data, and often supports transactional processes and operations. For example, analysis and reporting is greatly dependent on an organization's master data. Because master data may not be stored and referenced centrally, but is often used by several functional groups and stored in different data systems across an organization, master data may be duplicated and inconsistent (and if so, inaccurate)." Wikipedia 

In this article, we will be talking about master data services in the cloud. We will be looking at cloud as the next frontier of business insight computing. This article is being  presented in the following order:

     1.   What challenges and problems can the combined  cloud computing and MDS solve?
     2.   How are we going to solve those problems?
     3.   Technical and achitectural overview
     4.   Installation and configuration of Master data services
     5.   Basic concept of Master data services
     6.   Master data services basic operation
     7.   Advantages of the new SQL server 2012 insightful computing and the cloud

Real Life Scenario

Fictional company "Westwind" is operating  business in 98 countries around the world with 50,000 employees and more
than 2800 enterprise client. The company  wants a consistent record of  employees, client, contacts and subcontractors
across  the entire enterprise. These data will be use by more than in 500 different departments in different delivery
centers around the globe. These department level user are known as application team or "app team" in short. An
example of this is the "mylearning" team - a team similar to Microsft's mylearning team. They are know as app team
because they build a specific application for the entire "westwind" enterprise.

"Westwind" has an existing data warehouse solution and they will make use of Master Data Service  and Data Quality
Service (DQS) for master data management. Reference tables, dimensions and look-ups such as employees, client,
and accounts will be supplied to department level application consumer as a "data as a service" 

Some department such as sales, finance and human resource supplies data to the central data warehouse. They are
referred to as "inbound" applications becuase they supply data to master database. While the data from these
departments are being loaded, the warehouse is almost unavailable to the user.  However, "Westwind" requires that
the data be available in the soonest possible time and they require  an uptime of 99.9999 percent.

"Westwind" requires that there should be one source of truth. In the current setting the only source of truth
is the data warehouse. However,  the data warehouse is use for reporting and application team that may need that data
are only allowed  to pull the data from the warehouse using  integration packages.  This will later results to 1000 copies of the employees
table alone and 1000 SSIS package being maintained and monitored  by the integration team. There are at least 7
reference table in the company which will be used by 500 different department. Each department having different
application. This scenario will result to deployment of integration team around the world for 24/7 operation coverage

Westwind would like to conserve hardisk space and insists to limit the number of copies of employee
tables, accounts, and other reference table.

Another problem that "Westwind" encounters is that when inbound application changes their metadata.
The change  should be coordinated to the outbound or downstream application. The downstream application
may or may not need the data. The real headache comes in  for example the 7 reference table changes their
metadata each reference with at least 3 fields and each 7 reference table having at least 500 to 1000 downsteam
application.

With the above metioned problems, the integration team also faces an extreme challenge commencing from 
requirements gathering, build deadlines, testing and port openings due to changes in the server or server migration.

Westwind are also facing problems with the environment aside from port opening. These are patching, VM backup
retention of 3 days only, harddisk capacity planning and expansion and high availability solution.

"Westwind" stakeholders are well aware of the cloud however they are extra paranoid about putting their data on
the public cloud. 

"Westwind" want to simplify all of these complexity and reduce their total cost of ownership.

Master Data Services Solution

Westwind implemented master data services to solve many of their problems above. Master data services provided
some sort of governance over the data that goes in to the dimension table. 

To resolve downtime issue, "Westwind"  initially implemented 3 MDS  database in the cloud. While the main MDS
server is being populated the rest of the MDS database  are kept alive supplying data to the enterprise. There are 3 MDS
implementation, one each for EMEA, APAC and ASIA. EMEA servers are populated during EMEA off peak hours and APAC
are replicated during APAC off peak  hours and so it is in ASIA. Synchronization of data in every region does not affect
each other. With cloud implementation "Westwind" business insight team are now empowered to scale-up their cloud
VM roles by adding compute power  or scale-out  master data implementation by provisioning new servers.

To resolve the problems on schema changes from the inbound applications, Westwind has eliminated all the reference
tables in the downstream application and forces all of them to use the MDS reference  table that is geographically nearest
to them. EMEA user for example uses the EMEA-MDS to minimize bandwidth.

To rapidly respond to changing inbound schema changes, Westwind consolidate this changes to a monthly release. To
facilitate faster testing westwind provisions new VMRoles for a particular release leaving the current MDS online. When
the release passes all the testing requirements The server are then tag as the new production servers and the current
servers are archieved and  decommisioned from the cloud. All applications using the MDS table repoints to the new
production server by changing their configuration.

While most of the work of the integration team concerning integration were all gone, the team now focusses on
integration that are not involving master data.  The problems with requirements gathering, port opening, metadata
familiarity and  sample data are still present. To resolve this problem "Westwind" business insight team a implemented
metadata store in which all  participating applications/department can query and explore. Application team are
required to connect their database to the metadata store before requesting any integration activity. This metadata
allows "Westwind" business insight team to explore enterprise database and architecture even before
an official integration request.

With the new "Westwind" cloud enabled BI. Most of the environment and configurations were gone. Cloud
environment delivery team can expand or shrink computing resources such as CPU, hard disk storage and test
environment provisioning were all done in matter of minutes.

"Westwind" stakeholder paranoia concerning their data on the cloud were secured by implementing some sort of
DMZ in the cloud. Their servers can be only accessed onsite. This has been implemented via the VMrole internal
and virtual IP.

Master Data Service and the Business Insight Architecture

Below is the westwind proposed architecture.



To gain more  understanding the new BI and Cloud architecture please take your time to watch the video below:
        Cloud-Ready Data Services

Configuring Master Data Service in the Cloud

To setup a master data service in the cloud you should create a windows server 2012 64 bit VMrole.
Master data services works only on 64 bit operating system.

STEP #1. Provision a "Windows Server 2012  Cloud VM role"

The first step  is to provision a cloud "VM role". For this step please refer to the first demo in the
Cloud-Ready Data Services video published above.

Step #2. Install MDS by launching the SQL Server installer and check on MDS.

Launch the MDS installer on the windows server 2012 VM role.

Step #3.  Verify that MDS is Installed.

  

figure 2.  Showing MDS tiles in Windows Server 2012

Step #4. Enable IIS

To be able to configure Master Data services you should configure IIS.
The following links show how to Configure IIS in Windows server 2012.
For a complete list of web application requirements please consult the link

Step #5. Configure Master Data service database

Once you have completed the IIS requirements you will then configure the MDS database.
Click on the MDS Icon to begin installation.



Step #6. Configure the MDS Website

Next step is to configure the MDS web site

 

Step #7. Connect the mds website to an MDS database

Step #8. Open the master Data service administration pages



Master Data Services Basic Concept

Before proceeding lets review some of the basic terminology of MDS.

A.  Model.

Models are the highest level of data organization in Master Data Services. A model defines the structure of
data in your master data management solution. A model contains the following objects:
Entities, Attributes
and attribute groups,
Explicit and derived hierarchies, Collections.

MDS model is pretty much like a database in Microsoft DBMS terminology.

B. Entity 

Entities are objects that are contained in Master Data Services models. Each entity contains members,
which are the rows of master data that you manage.

Entities is very much like a table in Microsoft RDBMS terminology.

C. Attributes

Attributes are objects that are contained in Master Data Services entities. Attribute values describe the members of the entity.
An attribute can be used to describe a leaf member, a consolidated member, or a collection.

Master Data Service Basic Operations

A. Managing Data Warehouse Dimensions with MDS

One of the main objective of MDS is to manage warehouse dimension. In these videos you will learn how to
manage warehouse dimension using MDS

Managing Data Warehouse Dimensions with MDS (part 1)

Managing Data Warehouse Dimensions with MDS (part 2)


B. Master Data management with Excel addin

One of the newest feature of MDS 2012 is to manage your data using MDS addin for excel.  This tool will allow you to
perform a lot of work such as designing your entity, doing picklist and managing entity relationship. For more
information on this feature you can consult the following videos below:

        Using MDS Excel Add-In to Model and Manage your Master Data

C.  Managing MDS with Web User Interface Explorer overview

Below is an overview of The MDS Web User Interface.
SQL 2012 Master Data Services - Web User Interface Explorer overview

D. Data Quality Service and MDS working together

This Technet video will give you an overview of DQS and MDS working together

DQS Integration with MDS: Data Matching using MDS

 

E. Using DQS and SSIS for building "MDS inbound" integration package

Although you can manage data entry with excel add-in you will be using  SSIS to "inbound" the data to MDS database
especially if you have an OLTP system as source.

In the video link below, you will be shown how SSIS, DQS and MDS integrate with each other.

        Enterprise Information Management (EIM): Bringing Together SSIS, DQS, and MDS