This topic describes the documentation currently planned for the Microsoft Hadoop distributions. It is a work in progress. Community participation is encouraged so that the content meets the needs and expectations of the Apache Hadoop community working with Microsoft platforms and technologies, so please feel free to contribute. For instance, you can edit the list of Content Ideas to include descriptions of topics you'd like to see. If you're already working on documentation that you'd like to make known to others, you can add it to one of the monthly preview lists further below.


  • Identify what technical content will be produced.
  • Identify what content is lacking or misleading.
  • Identify timelines and channels for content delivery.

Intended Audience

The intended audience for this content is anyone with expertise and experience in Apache Hadoop, or the desire to obtain them.


Content will initially be delivered to the TechNet Wiki. We will also consider the MSDN Library and other community sites for subsequent publication.

TechNet Wiki

To support preview customers, we will create a landing page on the TechNet wiki, which shall be titled “Apache Hadoop on Windows”. This page will contain brief content sections on our distribution, but will primarily serve as an aggregator of links to resources on learning and working with Hadoop.


Note: The TechNet Wiki creates the article URL from the title of the article, replacing spaces and other whitespace characters with ‘-’. Changing the title of an article creates an additional URL based on the new title. URLs based on old titles are not broken by the change and can continue to be used.
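The title-to-URL behavior described in the note can be illustrated with a minimal sketch. The function and class names here are hypothetical and are not part of any TechNet Wiki API. The sketch assumes that each whitespace character maps to a single ‘-’ and that no other characters are changed, which may not match the wiki's exact rules.

```python
import re


def title_to_url_segment(title: str) -> str:
    """Return a URL segment for a title (hypothetical helper).

    Assumption: each whitespace character becomes one '-';
    all other characters are left unchanged.
    """
    return re.sub(r"\s", "-", title.strip())


class ArticleUrls:
    """Tracks every URL segment an article has had (illustrative only)."""

    def __init__(self, title: str) -> None:
        self.segments = [title_to_url_segment(title)]

    def rename(self, new_title: str) -> None:
        # A rename adds a segment for the new title; segments for
        # earlier titles are kept, so old links keep resolving.
        segment = title_to_url_segment(new_title)
        if segment not in self.segments:
            self.segments.append(segment)


article = ArticleUrls("Apache Hadoop on Windows")
article.rename("Hadoop on Windows")
print(article.segments)
```

Under these assumptions, the landing page title maps to the segment Apache-Hadoop-on-Windows, and renaming the article leaves that segment usable alongside the new one.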

Target Platforms

  • Hadoop clusters on the Windows Azure platform.
  • Apache Hadoop clusters provisioned through the Elastic Map Reduce (EMR) Portal.
  • Hadoop clusters deployed to on-premises hardware.

Content Maintenance

The Content and Community team will accept bug reports and feedback. Additionally, the team agrees to triage comments posted against published content.


The current plan is to have initial content ready by 12/14 for initial preview customers and to add further content incrementally. Once the preview is underway, existing content will be updated based on customer feedback, so please feel free to post constructive feedback for the community. Future content will also be created based on customer feedback.

By January we plan on having a set of how-to videos published.

By February we plan on publishing or updating best practice documentation in collaboration with the Apache Software Foundation.

December – Preview

  • Documentation plan (this document)     
  • TechNet Wiki landing page
  • Changes to content on Elastic Map Reduce (EMR) portal page
  • Quickstart / Deployment guides
  • Hive Interactive Portal (video)
  • How to Use the Excel Hive Add-in to Access Hive on Windows Azure (video)
  • Interactive JS (video)
  • How to Connect Excel to Hadoop on Azure via HiveODBC
  • Accessing Hadoop Hive via PowerPivot


January – Preview

  • A set of how-to videos
    • How to Get SQOOP Data into and out of SQL Server
    • How to Use PowerPivot with Hadoop
    • How to Use Excel with Hadoop
  • Best practices
  • Updated quickstart guides based on feedback

Content Ideas

  • FAQs
  • Using Hadoop with JavaScript
  • Hive
  • Third-party products (TBD)
  • Five samples off the shelf (TBD)    
  • Compare a useful Hadoop job with same goals done "old school" (pre-Hadoop) to focus in on the differences and extra leverage from Hadoop
  • Show how Hadoop fits into my data management/processing ecosystem (what does it replace, what does it complement, what is it orthogonal to)


Frequently asked questions and how-to content will be collected in the Apache Hadoop Based Services for Windows Azure How To and FAQ Guide.

Video Content

See the Video Plan document.