Computer Vision API


Azure's cloud-based Computer Vision API is easy and a lot of fun to use. This wiki article introduces the API and supplements the great documentation already available on the Azure docs site: Computer Vision Documentation.

Scenario


The sample code will perform the following.
  • Given a public URL of an image, the image will be retrieved. 
  • The image is sent to the vision API for analysis. 
  • The image is sent to the vision API to produce a thumbnail. 

Vision API


The Computer Vision API Version 1.0 supports many operations on images. Here is a summary of the supported actions:

The first step in using the API is to create a Cognitive Services resource in an Azure subscription:


Several APIs are available; in our scenario we are interested in the Vision API:

There are two pricing tiers currently available: free and S1 Standard. Each subscription can have one free-tier resource, as indicated in the image below:


Retrieving the image


This is simple enough thanks to HttpClient and its GetByteArrayAsync method. After the responses from the vision API are retrieved, they are combined in a view model and returned as a JsonResult:
using (var client = new HttpClient())
{
    byte[] byteData = await client.GetByteArrayAsync(url);
 
    var celebritiesResponse = AnalyseCelebrities(byteData);
    var thumbnailResponse = GetThumbnail(byteData);
 
    return new JsonResult(new Model(celebritiesResponse, thumbnailResponse));
}
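
The Model type passed to JsonResult is not shown in this article. The following is a minimal sketch of what such a view model might look like; the property names Analysis and Thumbnail are assumptions chosen for illustration and are not taken from the original sample:

```csharp
using System;

// Hypothetical view model: the article does not show the real Model class.
public class Model
{
    public Model(string analysis, string thumbnail)
    {
        // Raw JSON string returned by AnalyseCelebrities.
        Analysis = analysis ?? throw new ArgumentNullException(nameof(analysis));
        // Base64 data URI returned by GetThumbnail.
        Thumbnail = thumbnail ?? throw new ArgumentNullException(nameof(thumbnail));
    }

    // Public read-only properties so the JSON serializer can emit them.
    public string Analysis { get; }
    public string Thumbnail { get; }
}
```

Keeping the analysis as a raw JSON string leaves the parsing to the client; a server-side alternative would be to deserialize it first and expose typed properties.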
 

Vision API Analyze


The analyze operation of the Vision API supports several capabilities, which are selected through the query string arguments. In this example, we are posting the data as a byte array and returning the JSON response as a string:
private string AnalyseCelebrities(byte[] byteData)
{
    string requestParameters = "visualFeatures=Categories,Tags,Description,Faces,ImageType,Color,Adult&details=Celebrities&language=en";
 
    string uri = "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?" + requestParameters;
 
    HttpResponseMessage response;
 
    using(var client = new HttpClient())
    using (var content = new ByteArrayContent(byteData))
    {
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<key>");
 
        // This example uses content type "application/octet-stream".
        // The other content types you can use are "application/json" and "multipart/form-data".
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
        response = client.PostAsync(uri, content).Result;
              
        return response.Content.ReadAsStringAsync().Result;
    }
}
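
Because the capabilities are selected purely through the query string, the request URI can also be assembled from lists rather than a hard-coded string. The following is a rough sketch of that idea; the AnalyzeUri class and its Build method are hypothetical helpers, and the endpoint is simply the one used in the sample above:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any Azure SDK.
public static class AnalyzeUri
{
    // Endpoint copied from the sample above; the region must match your resource.
    private const string Endpoint =
        "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze";

    // Builds the analyze URI from the requested visual features and details.
    public static string Build(IEnumerable<string> visualFeatures,
                               IEnumerable<string> details = null,
                               string language = "en")
    {
        var query = new List<string>();
        AddList(query, "visualFeatures", visualFeatures);
        AddList(query, "details", details);
        if (!string.IsNullOrEmpty(language))
        {
            query.Add("language=" + Uri.EscapeDataString(language));
        }
        return query.Count == 0 ? Endpoint : Endpoint + "?" + string.Join("&", query);
    }

    // Appends "name=a,b,c" when at least one value is supplied.
    private static void AddList(List<string> query, string name, IEnumerable<string> values)
    {
        if (values == null) return;
        string joined = string.Join(",", values.Select(v => Uri.EscapeDataString(v)));
        if (joined.Length > 0) query.Add(name + "=" + joined);
    }
}
```

For example, AnalyzeUri.Build(new[] { "Categories", "Faces" }, new[] { "Celebrities" }) produces a URI that requests only categories, faces and celebrity details.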
 
Parsing the result is then a simple exercise using Json.NET:
var celebrities = JsonConvert.DeserializeObject<CelebritiesResponse>(celebritiesResponse);

The following is a basic class structure matching the response. Use it with care, as it was written during the preview of Cognitive Services and the response format may have changed since:
public class CelebritiesResponse
{
    public CategoriesSection[] categories { get; set; }
    public AdultSection adult { get; set; }
    public TagSection[] tags { get; set; }
    public DescriptionSection description { get; set; }
    public FaceSection[] faces { get; set; }
    public ColorSection color { get; set; }
    public ImageTypeSection imageType { get; set; }
}
 
public class CategoriesSection
{
    public string name { get; set; }
    public double score { get; set; }
    public DetailsSection detail { get; set; }
}
public class DetailsSection
{
    public CelebritiesSection[] celebrities { get; set; }
}
 
public class CelebritiesSection
{
    public string name { get; set; }
    public double confidence { get; set; }
}
 
public class ImageTypeSection
{
    public int clipArtType { get; set; }
    public int lineDrawingType { get; set; }
}
public class ColorSection
{
    public string dominantColorForeground { get; set; }
    public string dominantColorBackground { get; set; }
    public bool isBWImg { get; set; }
}
public class FaceSection
{
    public int age { get; set; }
    public string gender { get; set; }
}
public class AdultSection
{
    public double adultScore { get; set; }
    public double racyScore { get; set; }
}
public class DescriptionSection
{
    public CaptionSection[] captions { get; set; }
}
 
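public class TagSection
{
    public string name { get; set; }
    // The service returns confidence as a JSON number, so a double is used here.
    public double confidence { get; set; }
}
 
public class CaptionSection
{
    public string text { get; set; }
    // The service returns confidence as a JSON number, so a double is used here.
    public double confidence { get; set; }
}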
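
Once the response is deserialized, the celebrity matches sit inside the detail of each category and may be missing entirely. The sketch below shows one way to collect them safely; GetCelebrityNames and its 0.5 default threshold are hypothetical choices, and trimmed copies of the classes above are repeated only so that the example compiles on its own:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed copies of the classes above so the sketch is self-contained.
public class CelebritiesSection
{
    public string name { get; set; }
    public double confidence { get; set; }
}

public class DetailsSection
{
    public CelebritiesSection[] celebrities { get; set; }
}

public class CategoriesSection
{
    public string name { get; set; }
    public double score { get; set; }
    public DetailsSection detail { get; set; }
}

public static class CelebrityExtensions
{
    // Hypothetical helper: distinct celebrity names at or above minConfidence,
    // ordered from most to least confident.
    public static IReadOnlyList<string> GetCelebrityNames(
        this IEnumerable<CategoriesSection> categories, double minConfidence = 0.5)
    {
        if (categories == null) return Array.Empty<string>();

        return categories
            // Categories without a detail section carry no celebrity data.
            .Where(c => c?.detail?.celebrities != null)
            .SelectMany(c => c.detail.celebrities)
            .Where(p => p != null && p.confidence >= minConfidence)
            // The same person can appear under several categories.
            .GroupBy(p => p.name)
            .Select(g => new { Name = g.Key, Best = g.Max(p => p.confidence) })
            .OrderByDescending(x => x.Best)
            .Select(x => x.Name)
            .ToList();
    }
}
```

With the deserialized object from earlier, this would be called as celebrities.categories.GetCelebrityNames().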

Generate Thumbnail 


Generating a thumbnail is just as simple. Note that the response is converted to a base64 data URI so that it can be embedded directly in HTML without a second request:
private string GetThumbnail(byte[] byteData, int width = 300, int height = 300)
{
    string uri = $"https://westus.api.cognitive.microsoft.com/vision/v1.0/generateThumbnail?width={width}&height={height}&smartCropping=true";

    // Dispose the client and the request content once the call completes.
    using (var client = new HttpClient())
    using (var content = new ByteArrayContent(byteData))
    {
        client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "<key>");
        content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");

        HttpResponseMessage response = client.PostAsync(uri, content).Result;

        // Encode the returned image bytes as a data URI for direct use in an <img> tag.
        var base64 = Convert.ToBase64String(response.Content.ReadAsByteArrayAsync().Result);
        return String.Format("data:image/gif;base64,{0}", base64);
    }
}
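
The sample above always labels the thumbnail as image/gif. If the service might return a different image format (an assumption, since the article does not state the format), the MIME type can instead be guessed from the leading bytes of the response. The DataUri class below is a hypothetical helper sketching that approach for JPEG, PNG and GIF:

```csharp
using System;

// Hypothetical helper, not part of the original sample.
public static class DataUri
{
    // Guesses the image MIME type from well-known leading bytes ("magic numbers").
    public static string SniffImageMimeType(byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (StartsWith(data, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (StartsWith(data, 0x89, 0x50, 0x4E, 0x47)) return "image/png"; // "\x89PNG"
        if (StartsWith(data, 0x47, 0x49, 0x46, 0x38)) return "image/gif"; // "GIF8"
        return "application/octet-stream"; // Unknown format.
    }

    // Builds a data URI suitable for the src attribute of an <img> element.
    public static string FromImageBytes(byte[] data)
    {
        return "data:" + SniffImageMimeType(data) + ";base64," + Convert.ToBase64String(data);
    }

    // True when data begins with the given prefix bytes.
    private static bool StartsWith(byte[] data, params byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

In GetThumbnail, the last two lines could then be replaced with a call to DataUri.FromImageBytes on the response bytes.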

Summary

The Vision API is one of several powerful and easy-to-use services available in Azure. The documentation is exceptional and includes an overview, how-to guides, and quick starts in multiple languages.

Additionally, there is a live API testing console that can be used to call the services without writing a single line of code, which makes it a great way to get started. The Computer Vision API - v1.0 console allows you to specify the query parameters and subscription key (after selecting the appropriate data center):