Introduction

Cortana Intelligence Suite (formerly known as Cortana Analytics Suite) is a fully managed Big Data and Advanced Analytics suite to transform data into intelligent action. It is a collection of technologies, all part of the Microsoft Azure cloud offering. This article will provide an insight into this data science technology.

Business Intelligence - Evolved

Traditional Business Intelligence is all about descriptive and diagnostic analytics. Typical questions include:

  • What happened?
  • Why did it happen?

Advanced analytics goes beyond this, and focuses on predictive and prescriptive analytics. Typical questions include:

  • What will happen?
  • What should I do?


This is more complex, but also provides greater value. At a high level, it enables businesses to evolve in three ways:
  • Reactive → Proactive
  • Historical Data → Predictions
  • People and Processes → Decision Automation

Overview

The Microsoft solution for advanced analytics and the transformation of data into intelligent action is Cortana Intelligence Suite. The image below illustrates how this fits into the overall process. Raw data can originate from several sources, including databases, apps, sensors, devices and IoT systems. Various transformations, analytics and machine learning processes can then provide insights, which are presented or delivered to people, apps and automated systems. The intelligence components depicted in the middle are covered later in this article.

Architecture

Cortana Intelligence Suite is basically a marketing term for a bundle of integrated services. These services are provided as part of Microsoft's strategy to democratize AI. This involves making advanced analytics and big data processing tools readily available for users. The overall architecture is illustrated below.



Background story

Cortana is a fictional synthetic intelligence character in the Halo novel and video game series. Cortana is a "smart AI" which can learn and adapt, and was the inspiration for the Windows Digital Assistant. Initially Cortana was a codename, but it got such a strong response that Microsoft kept the name. Cortana symbolizes the contextual intelligence Microsoft hopes to achieve in the suite.

Components

The following sections will explain the key pillars and components of the technology suite. The components consist of IaaS, PaaS and SaaS offerings. Note that some may be in Preview, and/or not available in all regions. Verify this in the Azure Documentation.


Information Management

Orchestrate data movement on a fully managed, end-to-end platform. Use Azure Data Factory to build pipelines and collect and orchestrate data from the services you use for easier analysis. Plus, use Azure Data Catalog to effectively manage data sources and Azure Event Hubs to provide a staging area for incoming streaming data.

Data Factory

The Data Factory is used to compose and orchestrate data services at scale.

Its main purposes are to create, schedule, orchestrate and manage data pipelines, visualize data lineage, monitor health and automate cloud resource management. Both on-premises and cloud data sources can be used.

Most Data Factory elements, such as pipelines and datasets, are defined as hand-written JSON scripts.
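As a rough illustration of what such a JSON definition might look like, the Python sketch below builds a pipeline document as a dictionary and serializes it to JSON. The pipeline, dataset and activity names are hypothetical, and the property layout is an assumption made for illustration only; the authoritative schema is in the Data Factory documentation.

```python
import json


def build_copy_pipeline(name, source_dataset, sink_dataset):
    """Return a pipeline definition as a plain dictionary.

    The layout (name / properties / activities) is an illustrative
    assumption, not a verified Data Factory schema.
    """
    return {
        "name": name,
        "properties": {
            "description": "Copy raw telemetry into a staging table",
            "activities": [
                {
                    "name": "CopyRawToStaging",  # hypothetical activity name
                    "type": "Copy",
                    "inputs": [{"name": source_dataset}],
                    "outputs": [{"name": sink_dataset}],
                }
            ],
        },
    }


# All names below are made up for the example.
pipeline = build_copy_pipeline(
    "TelemetryPipeline", "RawBlobDataset", "StagingSqlDataset"
)
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Generating the JSON from code like this is only one option; the same document could equally be written by hand, as the text above describes.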

More details about Data Factory can be found here.

Data Catalog

   

The Data Catalog is an enterprise-wide metadata catalog for data assets. It simplifies data source discovery via search.

Its main purposes are to help users spend less time looking for data and more time getting value from it, and to bridge the gap between IT and the business by allowing everyone to contribute their insights.

Note that there is one Data Catalog per organization, and authentication only accepts an organizational account.

More details about Data Catalog can be found here.




Event Hubs

Event Hubs is a cloud-scale telemetry ingestion service that collects, transforms and stores events. It is a publish-subscribe service that handles high-volume, high-velocity data streams with low latency.

Its main purpose is to allow events to be ingested into Azure from many data sources, apps, platforms and devices. Another option for ingestion is IoT Hub. Event Hubs integrates seamlessly with other Azure services, and is the preferred method of event ingestion for Stream Analytics.
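To make the publish-subscribe idea more concrete, here is a toy in-memory model written in Python. It is not the Event Hubs SDK and makes no network calls; the class and method names are hypothetical. It only mimics two concepts that Event Hubs is built around: partitions, which keep events with the same key in order, and consumer groups, which let several readers process the same stream independently.

```python
from collections import defaultdict
import zlib


class ToyEventHub:
    """In-memory stand-in for a partitioned event log (not the Azure SDK)."""

    def __init__(self, partition_count=4):
        self.partitions = [[] for _ in range(partition_count)]
        # Each consumer group tracks its own read offset per partition.
        self.offsets = defaultdict(lambda: [0] * partition_count)

    def publish(self, partition_key, body):
        # A stable hash keeps all events with the same key in one partition,
        # which preserves their relative order.
        index = zlib.crc32(partition_key.encode("utf-8")) % len(self.partitions)
        self.partitions[index].append({"key": partition_key, "body": body})
        return index

    def receive(self, consumer_group):
        """Return the events this consumer group has not seen yet."""
        new_events = []
        offsets = self.offsets[consumer_group]
        for i, partition in enumerate(self.partitions):
            new_events.extend(partition[offsets[i]:])
            offsets[i] = len(partition)
        return new_events


hub = ToyEventHub()
hub.publish("device-1", {"temperature": 21.5})
hub.publish("device-2", {"temperature": 19.0})
hub.publish("device-1", {"temperature": 22.0})

# Two independent readers each see the full stream.
dashboard_events = hub.receive("dashboard")
archive_events = hub.receive("archive")
```

A second call to `hub.receive("dashboard")` returns an empty list until new events are published, because the offsets for that consumer group have already advanced.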

More details about Event Hubs can be found here.

Big Data Stores

Store and manage structured data using Azure SQL Data Warehouse that elastically scales with massively parallel processing. Implement a hyper-scale repository with no file size limits for unstructured data using Azure Data Lake Store to attain massive throughput and analytic performance.

Data Lake Store

The Data Lake Store is a cloud offering to store unstructured, semi-structured and structured data. It is built for the enterprise, with no restrictions on data size.

Its main purposes are to store big data and support Hadoop applications, with storage optimized for data analytics. The technology is secure, scalable and HDFS-compatible. Individual files can be petabytes in size, and the store supports low-latency scenarios.

More details about Data Lake Store can be found here.

SQL Data Warehouse

  
The SQL Data Warehouse is a data warehouse as a service (DWaaS) optimized for performance and large-scale, distributed workloads.

Its main purpose is to provide data warehousing as a cloud-based PaaS solution, with massively parallel processing power and scaling on demand or on a schedule. It is built on SQL Server, and supports relational and non-relational data. The data itself is stored in blob storage (not SQL DB).

More details about SQL Data Warehouse can be found here.

Machine Learning and Analytics

Design and publish predictive models with Azure Machine Learning, use Azure HDInsight to analyze data in Storm and Spark for Hadoop environments, integrate your code from R or Python, and analyze any kind or any size of data you need with Azure Data Lake Analytics and Azure Stream Analytics. Plus, use Microsoft Power BI to create rich visualizations that bring your data to life.

Machine Learning

  
Machine Learning is a service for building predictive models using statistical techniques. It can learn from existing data to forecast future behaviors, outcomes and trends. Several predefined algorithms and a drag-and-drop user interface lower the bar for creating powerful data science models.

Main purposes are anomaly detection, clustering scenarios, multiclass classification and regression analysis. Typical use-cases include getting, cleaning, transforming and normalizing data, in addition to building, scoring and evaluating models using the mentioned techniques. Both Python and R can be used to extend models beyond the built-in functions. A finished model can eventually be published as an Azure Web Service.
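As a minimal sketch of the "learn from existing data, then forecast" idea, the following plain-Python example fits a straight line to a handful of points using ordinary least squares and predicts the next value. It does not use Azure Machine Learning; the function names and the training data are made up for illustration, and a real model would normally be built and evaluated with the service's own modules or with R or Python libraries.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Score a new input with a fitted (slope, intercept) model."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data: month number versus units sold.
months = [1, 2, 3, 4, 5]
units_sold = [110, 120, 130, 140, 150]

model = fit_line(months, units_sold)   # "train" on existing data
forecast = predict(model, 6)           # forecast the next month
print(forecast)
```

The same train-then-score split is what a published web service exposes: training happens once, and callers only send new inputs to be scored.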

More details about Machine Learning can be found here.

Data Lake Analytics

Data Lake Analytics is an on-demand cloud analytics service where parallel data transformation and processing programs can be run using U-SQL, R, Python and .NET over petabytes of data.

Main purpose is to process any data, regardless of size or structure, without worrying about VMs, servers or clusters. Processing power can easily be adjusted, so the focus can be on jobs rather than infrastructure. Data Lake Analytics is optimized to work with Data Lake Store.

More details about Data Lake Analytics can be found here.

HDInsight

HDInsight is the only fully managed cloud Hadoop offering that provides optimized open source analytic clusters for Spark, Hive, MapReduce, HBase, Storm, Kafka, and R Server, backed by a 99.9% SLA.

Its main purpose is to serve as a cluster-as-a-service offering for distributed big data processing, scaling and querying capabilities. It runs on both Windows and Linux, and is based on the Hortonworks Data Platform distribution. It integrates easily with other open source projects, and is extensible by using U-SQL.
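To give a feel for the kind of distributed processing such a cluster runs, here is the classic word-count example expressed as map, shuffle and reduce steps in plain Python. It runs in a single process and does not use Hadoop or HDInsight; the function names and input lines are hypothetical. On a real cluster, the map and reduce steps would run in parallel on many nodes and the framework would perform the shuffle.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


# Made-up input; each string stands in for one line of a large file.
lines = ["big data on Azure", "Big Data analytics", "data lake"]

pairs = [pair for line in lines for pair in map_phase(line)]
word_counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(pairs).items())
print(word_counts)
```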

More details about HDInsight can be found here.

Stream Analytics

Stream Analytics is an analytic processing engine that provides real-time stream processing in the cloud.

Its main purposes are to gain real-time insights into data coming from IoT devices and other applications, and to perform real-time analytics on data from Event Hubs.

The service scales to large data volumes while still achieving high throughput, low latency and guaranteed resiliency. Millions of events can be streamed per second, and processed with a SQL-like language.
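A typical stream-processing task is aggregating events over fixed time windows, for example the average sensor reading per device per minute. The Python sketch below shows the idea of such a tumbling (fixed, non-overlapping) window over a small list of made-up events. It is an offline illustration with hypothetical field names, not Stream Analytics itself, where the equivalent logic would be expressed declaratively in the SQL-like query language.

```python
from collections import defaultdict


def tumbling_window_average(events, window_seconds):
    """Average 'value' per (device, window start) over fixed windows.

    Each event is a dict with 'device', 'timestamp' (seconds) and 'value'.
    """
    buckets = defaultdict(list)
    for event in events:
        # Integer division assigns every event to exactly one window.
        window_start = (event["timestamp"] // window_seconds) * window_seconds
        buckets[(event["device"], window_start)].append(event["value"])
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Made-up telemetry events.
events = [
    {"device": "sensor-a", "timestamp": 3, "value": 20.0},
    {"device": "sensor-a", "timestamp": 42, "value": 22.0},
    {"device": "sensor-b", "timestamp": 15, "value": 18.0},
    {"device": "sensor-a", "timestamp": 65, "value": 30.0},
]

averages = tumbling_window_average(events, window_seconds=60)
for (device, window_start), average in sorted(averages.items()):
    print(device, window_start, average)
```

Because the windows do not overlap, each event contributes to exactly one result row, which keeps the aggregation cheap enough to run continuously.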

More details about Stream Analytics can be found here.


Intelligence

Explore Cognitive Services and learn how to enable natural and contextual interactions in your apps. Integrate analytics services and models with Cortana, your personal digital assistant, to let users interact with your app through speech and receive proactive notifications. Build and connect intelligent bots that naturally interact with your users wherever they are—from SMS to Skype and Office 365—by using the Bot Framework.

Cognitive Services

Cognitive Services are a set of APIs, SDKs and cloud services to build intelligent systems.

Main purposes are to make applications more personalized, intelligent and engaging by incorporating emotion recognition, facial detection, video intelligence, recommendations etc.

More details about Cognitive Services can be found here.

Bot Framework

Bot Framework allows humans to interact with computer systems in a more natural, conversational way.

The main purpose is to build and connect intelligent bots with end users, wherever they are and whatever platform they use. Supported platforms include custom websites and apps, Facebook Messenger, SMS, Skype and Slack.

More details about Bot Framework can be found here.

Cortana

Cortana is the virtual personal assistant for asking questions, finding and managing things, and monitoring and receiving alerts.

The main purpose is to connect users to systems and services across platforms and devices. This is typically performed using natural language. Cortana interacts with Bot Framework and application APIs to provide answers and results for the end user.

More details about Cortana can be found here.

Dashboards & Visualizations

Create dashboards and visualize results using Power BI.

Power BI

  

Power BI is a suite of business analytics tools to analyze data and share insights.

Main purposes are to collect data from various sources, extract relevant information, and present it using powerful dashboards and graphics. Details can then be further analyzed, and intelligent decisions made. Power BI can present both real-time data from Stream Analytics, and long-term data aggregated from years of data collection, for instance.

Power BI is available as a standalone application, a web-based interface, and mobile apps.

More details about Power BI can be found here.


Other Components

In an end-to-end scenario, additional components and technologies will typically be used. These may include IoT Hub, Blob Storage, DocumentDB, Azure SQL Database, Azure Virtual Machines, Excel, Azure Automation, Azure Active Directory, ExpressRoute, Virtual Network and custom code using C#/R/Python etc. Many of these are described on the Microsoft Azure Site.

Sample Scenarios

This technology has a range of practical applications, and is already being used to solve complex analytics challenges around the world. The illustration below shows a few examples for some industries and areas.

Detailed samples and scenarios are available on the Cortana Intelligence Suite Industry Solutions site. The diagram below shows one sample for vehicle telemetry analytics. The architecture utilizes several components in the suite, and also shows the workflow and how it's all connected. The sample is described in more detail here.

Cortana Intelligence Gallery

To help customers and developers create intelligent solutions, there are numerous templates, APIs, experiments and much more available in the Cortana Intelligence Gallery. Both Microsoft and third parties can publish content here, making it a valuable community resource.

The image below shows one of the most popular Azure Machine Learning experiments - Telco Customer Churn. This and several others are readily available for download.

Conclusion

By democratizing AI, Microsoft wants to make this technology available for everyone. The goal is to make machines and humans work together to make the world a better place. Cortana currently has 133 million active users, and has processed over 13 billion queries/questions. The first building block around Machine Learning and Advanced Analytics is Cortana Intelligence Suite. The number of customers and industries using it is growing, enabling them to make better business decisions. We are only scratching the surface of what AI can accomplish. Although it is up to developers and IT pros to create intelligent solutions, Microsoft wants to empower every person and organization to create magical experiences and solve great challenges.

See Also

Another important place to find an extensive amount of Cortana Intelligence Suite related articles is the TechNet Wiki itself. The best entry point is Cortana Intelligence Suite Resources on the TechNet Wiki.