How to use the Azure Custom Vision SDK to Implement Object Classification Vision Models

Welcome to today’s post.

In today’s post I will be showing you how to use the Azure Custom Vision Training and Prediction API to create, train, publish and test a custom vision object classification training model.

In a previous post, I showed how to use the Azure Custom Vision Portal to create, train, publish and test a custom vision object classification training model.

When we want to build a client application that creates and trains a custom vision model, the Custom Vision API is required. As we have seen within the Custom Vision Portal, the end-to-end process involves creating the project, uploading images, tagging the images, training on the images, publishing the custom vision model, and testing the model with a prediction API. This is a set of tasks that can be automated. I will show how this is done through the Custom Vision Training and Prediction API.

Configuration of the Environment for Developing Custom Vision Models

To work with the Custom Vision Training and Prediction API, the development environment I will use is Visual Studio 2022.

Before we use the Visual Studio environment for development, we will need to set up some configuration keys and values that will be used to configure and create instances of the training and prediction custom vision resources.

The following are the configuration keys that will be required to create instances of and make calls to the Custom Vision Training and Prediction API libraries for the purpose of creating and managing Custom Vision models.

For Custom Vision Training models, the following configurations are required:

Training Resource Name

Training API Endpoint

The Training API Endpoint URL is of the form:

https://[resource-name].cognitiveservices.azure.com/

For Custom Vision Predictions, the following configurations are required:

Prediction Resource Name

Prediction API Endpoint

Prediction Resource Id

The Prediction resource name is suffixed with -prediction.

The Prediction API Endpoint URL is of the form:

https://[resource-name]-prediction.cognitiveservices.azure.com/

The Prediction Resource Id is of the form:

/subscriptions/{subscription-id}/resourceGroups/{resource-group}/providers/Microsoft.CognitiveServices/accounts/[{resource-name}-Prediction]

Note: You can find the prediction resource ID on the prediction resource’s Properties tab in the Azure portal, listed as Resource ID.
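To make the shape of the resource ID concrete, here is a minimal sketch that assembles it from its parts. The helper name and the sample values in the usage note are hypothetical; in practice, copying the ID from the portal as described above is the safer route.

```csharp
using System;

public static class ResourceIdSketch
{
    // Hypothetical helper: assembles the resource ID of a Cognitive
    // Services account from its subscription, resource group and name.
    public static string BuildPredictionResourceId(
        string subscriptionId, string resourceGroup, string accountName)
    {
        return $"/subscriptions/{subscriptionId}" +
               $"/resourceGroups/{resourceGroup}" +
               "/providers/Microsoft.CognitiveServices" +
               $"/accounts/{accountName}";
    }
}
```

For example, ResourceIdSketch.BuildPredictionResourceId("0000", "my-rg", "myvision-Prediction") produces an ID in the same form as the template above.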

Once you have the above values, together with the key for each resource, store them in environment variables on the local development workstation. On Windows, this can be done with the setx command from a command prompt or PowerShell session.

Below are the setx commands that we can apply to set up the above configuration values:

setx VISION_TRAINING_KEY [training-key]
setx VISION_TRAINING_ENDPOINT [training-endpoint]
setx VISION_PREDICTION_KEY [prediction-key]
setx VISION_PREDICTION_ENDPOINT [prediction-endpoint]
setx VISION_PREDICTION_RESOURCE_ID [prediction-resource-id]

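After applying the above configuration changes, the values are accessible within the Visual Studio environment. Because setx only affects processes that are started afterwards, restart Visual Studio if it was already open.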

Setup of the Training and Prediction Custom Model Library SDK

As I mentioned in the previous section, to use the Custom Vision SDK, the development environment we will use is the Visual Studio 2022 Development Environment. After initializing the resource names and keys for the Training and Prediction custom vision resources, we can start building our client application.

The first dependencies required in the application are the following NuGet packages:

Microsoft.Azure.CognitiveServices.Vision.CustomVision.Training

Microsoft.Azure.CognitiveServices.Vision.CustomVision.Prediction

In the source files that use the custom vision resources, we will need to add the following library namespaces:

using Microsoft.Azure.CognitiveServices.Vision.CustomVision.Prediction;
using Microsoft.Azure.CognitiveServices.Vision.CustomVision.Training;
using Microsoft.Azure.CognitiveServices.Vision.CustomVision.Training.Models;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.Json;
using System.Threading;

Within the Program class, we declare any static variables that will store the configuration values for each of our custom vision resources:

internal class Program
{
    static string trainingEndpoint;
    static string trainingKey;
    static string predictionEndpoint;
    static string predictionKey;
    static string predictionResourceId;
    static string publishedModelName;
    …
}

In addition, we will require variable declarations for the three sets of images that we will be using for training.

We will use lists to store the folder sources for each class of image:

static List<string> spoonImages;
static List<string> forkImages;
static List<string> knifeImages;

Each class of image requires a tag, which is declared as follows:

static Tag spoonTag;
static Tag forkTag;
static Tag knifeTag;

For the current training iteration, which we later publish and test, we will need to store the iteration state, which is declared as follows:

static Iteration iteration;

When uploading the test image for use for prediction tests, we store the image contents in memory using the following declaration:

static MemoryStream testImage;

In the Main() method of the Program class, we will implement the methods that are used to do the following:

1. Create instances of the Custom Vision Training API and Custom Vision Prediction API.
2. Create a new Training project for the Custom Vision Training API.
3. Create Tags for each image classification.
4. Upload image files from disk into memory for training.
5. Upload image files from disk into memory for testing.
6. Add and tag in-memory files into the Custom Vision Project’s Training model.
7. Train the tagged images that are stored in the Custom Vision Project as an Iteration.
8. Publish the latest training iteration of the Custom Vision trained model as a Prediction API using the Prediction resource.
9. Test the iteration of the Custom Vision trained model using the published Prediction API.
10. Display the results of the tests using the Prediction API.
11. Unpublish the latest training iteration of the Custom Vision trained model.
12. Remove the Custom Vision training project (including the model).

The diagram below provides a summary of most of the tasks mentioned above:

The above seems like a lot of tasks. I will provide methods that implement each of the above in the next section.

Creation of Training and Prediction Resources and Projects

In the first part of the implementation, I will show how the resource keys are initialized, how the resources are created, and how the custom vision training project is created. This comprises the first two tasks in the list above.

Below is the first part of the Main() method, where we initialize the variables and create the custom vision resources.

static void Main(string[] args)
{
        InitializeVariables();

        try
        { 
               CustomVisionTrainingClient trainingApi = AuthenticateTraining(
                   trainingEndpoint,
                   trainingKey);

               CustomVisionPredictionClient predictionApi = AuthenticatePrediction(
                   predictionEndpoint,
                   predictionKey);
               ….

When creating and authenticating to each of the resources, we use a separate key for the training resource and for the prediction resource.

The variable initialization method is shown below:

private static void InitializeVariables()
{
    trainingEndpoint = Environment.GetEnvironmentVariable("VISION_TRAINING_ENDPOINT");
    trainingKey = Environment.GetEnvironmentVariable("VISION_TRAINING_KEY");
    predictionEndpoint = Environment.GetEnvironmentVariable("VISION_PREDICTION_ENDPOINT");
    predictionKey = Environment.GetEnvironmentVariable("VISION_PREDICTION_KEY");
    predictionResourceId = Environment.GetEnvironmentVariable("VISION_PREDICTION_RESOURCE_ID");
    publishedModelName = "cutleryClassModel";
}
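Environment.GetEnvironmentVariable() returns null when a variable has not been set, so a missing value only surfaces later as an authentication or endpoint error. As a minimal sketch, a check like the one below could run straight after InitializeVariables(). The class name ConfigCheck and its members are hypothetical and are not part of the Custom Vision SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConfigCheck
{
    // Names of the environment variables read by InitializeVariables().
    public static readonly string[] RequiredVariables =
    {
        "VISION_TRAINING_ENDPOINT",
        "VISION_TRAINING_KEY",
        "VISION_PREDICTION_ENDPOINT",
        "VISION_PREDICTION_KEY",
        "VISION_PREDICTION_RESOURCE_ID"
    };

    // Returns the names whose values are null, empty or whitespace.
    // The lookup is passed in as a delegate so the check can be
    // exercised without touching the real process environment.
    public static List<string> FindMissing(
        IEnumerable<string> names, Func<string, string?> lookup)
    {
        return names
            .Where(n => string.IsNullOrWhiteSpace(lookup(n)))
            .ToList();
    }
}
```

A call such as ConfigCheck.FindMissing(ConfigCheck.RequiredVariables, Environment.GetEnvironmentVariable) returns an empty list when everything is configured, and the program can stop with a clear message otherwise.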

The resource authentication methods are shown below:

private static CustomVisionTrainingClient AuthenticateTraining(
    string? trainingEndpoint, string? trainingKey)
{
    // Create the training API, passing in the training key
    CustomVisionTrainingClient trainingApi = new CustomVisionTrainingClient(
        new Microsoft.Azure.CognitiveServices.Vision.CustomVision.Training
            .ApiKeyServiceClientCredentials(trainingKey))
    {
        Endpoint = trainingEndpoint
    };
    return trainingApi;
}

private static CustomVisionPredictionClient AuthenticatePrediction(
    string? predictionEndpoint, string? predictionKey)
{
    // Create the prediction API, passing in the prediction key
    CustomVisionPredictionClient predictionApi = new CustomVisionPredictionClient(
        new Microsoft.Azure.CognitiveServices.Vision.CustomVision.Prediction
            .ApiKeyServiceClientCredentials(predictionKey))
    {
        Endpoint = predictionEndpoint
    };
    return predictionApi;
}

The Custom Vision Training resource requires an instance of CustomVisionTrainingClient to make use of the Custom Vision Training SDK, and the Custom Vision Prediction resource requires an instance of CustomVisionPredictionClient to make use of the Custom Vision Prediction SDK.

Before we can add tagged images to the training project, we require the creation of a Custom Vision project. This is done with the following call:

Project project = CreateProject(trainingApi, "CutleryImageClassificationProject");

Where the CreateProject() method is implemented as shown:

// Create a Custom Vision Project
private static Project CreateProject(CustomVisionTrainingClient trainingApi, 
string projectName)
{
    // Create a new custom vision project
    Console.WriteLine($"Creating new custom vision project: {projectName}");
    return trainingApi.CreateProject(projectName);
}

After the project is created successfully, it will return a Project instance with the Status property set to “Succeeded”.

In the next section, I will show how to add tags, then upload and tag images to the training dataset. I will also show how to upload a file used for later testing using the Prediction API.

Uploading to Projects of Tagged Images within Training Datasets

In this section, I will show how to add tags, then upload and tag images to the training dataset.

The first step is to add the image classification tags to the project. This is done as shown:

AddTags(trainingApi, project);

The AddTags() method implementation is shown below:

// Add Tags
private static void AddTags(CustomVisionTrainingClient trainingApi, Project project)
{
    // Create the image tags for the project.
    spoonTag = trainingApi.CreateTag(project.Id, "Spoon");
    forkTag = trainingApi.CreateTag(project.Id, "Fork");
    knifeTag = trainingApi.CreateTag(project.Id, "Knife");
}

The Training API method for creating each image classification tag is:

trainingApi.CreateTag(Guid project-id, string tag-name);

A sample image tag instance is shown below:

The next step is to request the file upload folder, then load the training image file names and full path from the folder path. This is done as follows:

Console.WriteLine("Enter the folder path for the image input files:");
string? folderPath = String.Empty;
folderPath = Console.ReadLine();
LoadImageFilenamesFromDisk(folderPath);

The method LoadImageFilenamesFromDisk() implementation is shown below:

// Get folders for each set of images.
private static void LoadImageFilenamesFromDisk(string imageFolder)
{
    // this loads the images file names into lists.
    spoonImages = Directory.GetFiles(Path.Combine(imageFolder, "Spoon")).ToList();
    forkImages = Directory.GetFiles(Path.Combine(imageFolder, "Fork")).ToList();
    knifeImages = Directory.GetFiles(Path.Combine(imageFolder, "Knife")).ToList();
}
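Directory.GetFiles() returns every file in a folder, including any non-image files that happen to be there. A minimal sketch of filtering the lists before upload is shown below. The class name ImageFileFilter is hypothetical, and the extension list is an assumption that should be adjusted to the formats you actually use.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ImageFileFilter
{
    // Assumed set of image extensions; adjust to match your data.
    private static readonly HashSet<string> ImageExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { ".jpg", ".jpeg", ".png", ".bmp", ".gif" };

    // Keeps only the paths whose extension is in the set above,
    // preserving the original order.
    public static List<string> OnlyImages(IEnumerable<string> paths)
    {
        return paths
            .Where(p => ImageExtensions.Contains(Path.GetExtension(p)))
            .ToList();
    }
}
```

With this in place, a line such as spoonImages = ImageFileFilter.OnlyImages(Directory.GetFiles(Path.Combine(imageFolder, "Spoon"))) would skip stray files in the Spoon folder.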

A sample list of image file names is shown below:

Now that we have the lists of training image file names, we read each image into a memory stream and then upload the contents of the memory stream into the training project’s custom model, together with the classification tag associated with that batch of images. This is done as follows:

UploadImages(trainingApi, project, spoonImages, spoonTag);
UploadImages(trainingApi, project, forkImages, forkTag);
UploadImages(trainingApi, project, knifeImages, knifeTag);

Below is the implementation of UploadImages():

// Upload images individually for each image folder.
private static void UploadImages(CustomVisionTrainingClient trainingApi,
            Project project, List<string> imageFiles, Tag imageTag)
{
        // Add some images to the tags
        Console.WriteLine($"Uploading training images for tag {imageTag.Name} ...");
        Console.WriteLine();

        // Images can be uploaded one at a time
        foreach (var image in imageFiles)
        {
            using (var stream = new MemoryStream(File.ReadAllBytes(image)))
            {
                trainingApi.CreateImagesFromData(
                    project.Id,
                    stream,
                    new List<Guid>()
                    {
                        imageTag.Id
                    }
                );
            }
        }

        Console.WriteLine($"Finished uploading training images for tag {imageTag.Name}.");
        Console.WriteLine();
}

The above method is general for any specified training API, project, tag and list of files.
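Uploading one image per call means one round trip to the service for every file. The Training SDK also provides CreateImagesFromFiles(), which accepts an ImageFileCreateBatch holding several ImageFileCreateEntry items. The sketch below only shows the grouping step that such an approach would need. The class name UploadBatching is hypothetical, and the batch size of 64 mentioned in the usage note is an assumption to verify against the current service limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UploadBatching
{
    // Splits a list of file paths into consecutive groups of at most
    // batchSize items, preserving the original order.
    public static List<List<string>> SplitIntoBatches(
        IReadOnlyList<string> files, int batchSize)
    {
        if (batchSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(batchSize));

        var batches = new List<List<string>>();
        for (int i = 0; i < files.Count; i += batchSize)
        {
            batches.Add(files.Skip(i).Take(batchSize).ToList());
        }
        return batches;
    }
}
```

Each group returned by UploadBatching.SplitIntoBatches(spoonImages, 64) could then be turned into one ImageFileCreateBatch and sent in a single call, instead of looping over individual files.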

The Training API method below is used to upload the contents of the memory stream to the project’s image training dataset with image tags:

trainingApi.CreateImagesFromData(
    Guid project-id, 
    MemoryStream stream, 
    List<Guid> image-tag-id-list 
)

Below is the display we see after uploading all three sets of images to the project training data set:

The next step is to upload the test image from the same folder path as above into a local memory stream. This is done as shown:

UploadTestImage(folderPath);

The method UploadTestImage() implementation is shown below:

// Upload test image from folder into memory stream.
private static void UploadTestImage(string imageFolder)
{
    // Load the test image from disk into memory
    Console.WriteLine("Uploading Test Image ...");
    Console.WriteLine();

    string testImageFile = Path.Combine(imageFolder, "TestImage.jpg");

    testImage = new MemoryStream(File.ReadAllBytes(testImageFile));
}

After establishing the image training data and tags, we are ready to train the project to produce a custom vision classification model, then publish the model for consumption. I will show how this is done in the next section.

Execution of Training and Publishing for the Project

The project now contains tagged training images that will be used for object classification. To execute training for the project, we make the following call:

TrainProject(trainingApi, project);

The TrainProject() implementation is shown below:

// Train the Project
private static void TrainProject(CustomVisionTrainingClient trainingApi, Project project)
{
    // With the tagged images prepared, we start training the project.
    Console.WriteLine("Start Training ...");
    Console.WriteLine();

    iteration = trainingApi.TrainProject(project.Id);

    // The returned iteration will be in progress.
    // We will query its progress.
    while (iteration.Status == "Training")
    {
        Console.WriteLine("Waiting 10 seconds for training to complete...");
        Thread.Sleep(10000);

        // Re-query the iteration to get its updated status
        iteration = trainingApi.GetIteration(project.Id, iteration.Id);
    }
}
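The loop above keeps polling for as long as the status is "Training", with no upper bound on the wait. A minimal sketch of the same polling pattern with an attempt limit is shown below. The class name TrainingPoller is hypothetical, and the status source is passed in as a delegate so that the sketch does not depend on the SDK.

```csharp
using System;
using System.Threading;

public static class TrainingPoller
{
    // Polls getStatus until it stops returning "Training" or until
    // maxAttempts polls have been made. Returns the last status seen,
    // so the caller can tell a finished run from a timed-out one.
    public static string WaitWhileTraining(
        Func<string> getStatus, int maxAttempts, TimeSpan delay)
    {
        string status = getStatus();
        int attempts = 1;
        while (status == "Training" && attempts < maxAttempts)
        {
            Thread.Sleep(delay);
            status = getStatus();
            attempts++;
        }
        return status;
    }
}
```

In the program above, the delegate could be () => trainingApi.GetIteration(project.Id, iteration.Id).Status, and the caller would proceed to publishing only when the returned value is "Completed".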

To commence project training and obtain a training iteration, we make the following Training API call:

Iteration trainingApi.TrainProject(Guid project_id);

To obtain the status of the training run, we read the Status property of the Iteration class, which can return one of the following results:

Training

Completed

Training indicates that training is still in progress, and Completed indicates that training has finished.

To re-query the training status, we re-read the iteration instance with the Training API call:

Iteration trainingApi.GetIteration(Guid project-id, Guid iteration-id);

Below are the results of the same training iteration run when viewed in the Custom Vision Portal:

When training has completed, we can then publish the training model as a prediction API. This is done as follows:

PublishIteration(trainingApi, project);

The method PublishIteration() implementation is shown below:

// Publish Iteration
static void PublishIteration(CustomVisionTrainingClient trainingApi, Project project)
{
    trainingApi.PublishIteration(
        project.Id,
        iteration.Id,
        publishedModelName,
        predictionResourceId);

    Console.WriteLine("Iteration Published!");
    Console.WriteLine();
}

The main Training API method used to publish the custom vision model as a Prediction API is:

trainingApi.PublishIteration(
    Guid project-id,
    Guid iteration-id,
    string published-model-name,
    string prediction-resource-id);

In the next section, I will show how to run a prediction test with the published Prediction API.

Running Tests with the Prediction API

With the published custom vision model, we can test the model by making predictions using the Prediction API. With calls to the prediction API, we pass in an instance of CustomVisionPredictionClient. This is done as shown:

TestIteration(predictionApi, project);

The TestIteration() method is implemented as shown:

// Test Iteration
private static void TestIteration(CustomVisionPredictionClient predictionApi, 
Project project)
{
    // Make a prediction against the new project
    Console.WriteLine("Making a prediction from a test image:");
    var result = predictionApi.ClassifyImage(
        project.Id,
        publishedModelName,
        testImage
    );

    // Loop over each prediction and write out the results
    foreach (var c in result.Predictions)
    {
        Console.WriteLine($"\t{c.TagName}: {c.Probability:P1}");
    }
}

The main Prediction API method for kicking off a prediction for a test image is defined below:

ImagePrediction predictionApi.ClassifyImage(
    Guid project-id,
    String published-model-name,
    MemoryStream test-image
);

Where the ImagePrediction class contains the results of the image prediction in the Predictions property. Two useful properties within the Predictions property that we use for reporting are:

TagName

Probability

Where TagName is the name of the tag for the classified object, and Probability is the confidence that the object has that classification.
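The loop in TestIteration() prints every tag with its probability. If the application needs a single answer instead, one option is to take the highest-probability tag and reject it when the confidence is too low. The sketch below assumes this approach. The class name PredictionSummary is hypothetical, and plain tuples stand in for the SDK's prediction objects so that the code is self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PredictionSummary
{
    // Returns the tag with the highest probability, or null when the
    // list is empty or no prediction reaches minProbability.
    public static string? BestTag(
        IEnumerable<(string TagName, double Probability)> predictions,
        double minProbability)
    {
        var best = predictions
            .OrderByDescending(p => p.Probability)
            .FirstOrDefault();

        // For an empty sequence, 'best' is the default tuple (null, 0.0).
        if (best.TagName == null || best.Probability < minProbability)
            return null;

        return best.TagName;
    }
}
```

The input could be produced from the prediction result with result.Predictions.Select(p => (p.TagName, p.Probability)), and the threshold is a value to tune for your own model.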

Below is a sample run of output in the console:

In the Custom Vision Portal, the same prediction on the testing image thumbnail is available for viewing as shown:

And the corresponding image detail is shown:

Removing the Custom Vision Training Project

After you have made use of your custom Vision Training project, if you no longer require the model, you can remove it. I will show how this is done.

Before you can remove a training project, you will need to unpublish the training iteration of the project.

Attempting to delete the training project without unpublishing the training iteration results in the following error:

The following call unpublishes the training iteration:

UnPublishIteration(trainingApi, project);

The implementation of UnPublishIteration() is shown below:

// UnPublish Iteration
static void UnPublishIteration(CustomVisionTrainingClient trainingApi, Project project)
{
    trainingApi.UnpublishIteration(
        project.Id,
        iteration.Id);

    Console.WriteLine("Iteration Unpublished!");
    Console.WriteLine();
}

The Training API method that performs the unpublishing of the iteration is:

trainingApi.UnpublishIteration(
    Guid project-id,
    Guid iteration-id
);

The above is equivalent to the following action in the Custom Vision portal:

To remove the project, we make the following call:

DeleteProject(trainingApi, project);

The method DeleteProject() implementation is shown below:

private static void DeleteProject(CustomVisionTrainingClient trainingApi, Project project)
{
    Console.WriteLine("Deleting Project ...");
    Console.WriteLine();

    string projectName = project.Name;

    trainingApi.DeleteProject(project.Id);

    Console.WriteLine($"Project {projectName} deleted.");
    Console.WriteLine();
}

The Training API method that performs the deletion of the project is:

trainingApi.DeleteProject(Guid project-id);

The above is equivalent to pressing the trash bin icon in the project gallery screen as shown:

In the next section, I will explain how to catch and display errors when running the above custom vision processing tasks.

Error Handling of Custom Vision Project Training Tasks

To handle errors within a Custom Vision training run, we include the calls to our custom methods within a try … catch block.

Inside the catch block, we catch exceptions of type CustomVisionErrorException, which are in the namespace:

Microsoft.Azure.CognitiveServices.Vision.CustomVision.Training.Models

The code block with the error handler is shown below:

try
{ 
    CustomVisionTrainingClient trainingApi = AuthenticateTraining(
        trainingEndpoint,
        trainingKey);

    CustomVisionPredictionClient predictionApi = AuthenticatePrediction(
        predictionEndpoint,
        predictionKey);

    Project project = CreateProject(trainingApi, "CutleryImageClassificationProject");

    Console.WriteLine("Enter the folder path for the image input files:");
    string? folderPath = String.Empty;
    folderPath = Console.ReadLine();

    AddTags(trainingApi, project);

    LoadImageFilenamesFromDisk(folderPath);

    UploadImages(trainingApi, project, spoonImages, spoonTag);
    UploadImages(trainingApi, project, forkImages, forkTag);
    UploadImages(trainingApi, project, knifeImages, knifeTag);
    UploadTestImage(folderPath);

    TrainProject(trainingApi, project);

    PublishIteration(trainingApi, project);

    TestIteration(predictionApi, project);

    UnPublishIteration(trainingApi, project);

    DeleteProject(trainingApi, project);
}
catch (CustomVisionErrorException ex1)
{
    VisionServiceError error = 
        JsonSerializer.Deserialize<VisionServiceError>(json: ex1.Response.Content);
    Console.WriteLine($"A Vision Model Error has occurred: {error?.message}");
}
catch (Exception ex)
{
    Console.WriteLine($"An Error has occurred: {ex.Message}");
}

Exceptions of type CustomVisionErrorException contain a Body property with the parsed error, and a Response.Content property, which has a JSON string containing a code and a message property.

In the step-through interactive debugger, the watch on the exception shows a more useful description in the Code and Message properties:

To extract the above error properties and display them, use a class with these properties:

internal class VisionServiceError
{
    public string? code { get; set; }
    public string? message { get; set; }
}

We then deserialize the JSON content of the exception body string into the above helper class:

VisionServiceError error = JsonSerializer.Deserialize<VisionServiceError>(
    json: ex1.Response.Content
);
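One thing to be aware of is that the deserialization call itself can throw if the response content is empty or is not JSON, and that would happen inside the catch block. A more defensive variant is sketched below. The class name ServiceErrorSketch is hypothetical, and the case-insensitive option is an assumption that lets the properties follow the usual C# naming while still matching lowercase JSON keys.

```csharp
using System;
using System.Text.Json;

public class ServiceErrorSketch
{
    public string? Code { get; set; }
    public string? Message { get; set; }

    // Parses an error payload. Returns null when the text is empty
    // or is not valid JSON, instead of throwing.
    public static ServiceErrorSketch? TryParse(string? json)
    {
        if (string.IsNullOrWhiteSpace(json)) return null;
        try
        {
            var options = new JsonSerializerOptions
            {
                PropertyNameCaseInsensitive = true
            };
            return JsonSerializer.Deserialize<ServiceErrorSketch>(json, options);
        }
        catch (JsonException)
        {
            return null;
        }
    }
}
```

When TryParse() returns null, the handler can fall back to printing the exception's own Message property.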

The extracted errors from the JSON in the error body are shown when we attempt to remove a project that has an existing published iteration.
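In the error handler above, a failure part-way through (for example during the prediction test) leaves the iteration published and the project in place, because the unpublish and delete calls are skipped. One possible way to avoid that is a try … finally wrapper along the lines of the sketch below. The class name CleanupSketch is hypothetical, and the delegates stand in for the methods defined earlier in this post.

```csharp
using System;

public static class CleanupSketch
{
    // Runs 'work', then always runs each cleanup step in order.
    // A failing cleanup step is reported and does not stop the
    // remaining steps. An exception from 'work' still propagates.
    public static void RunWithCleanup(Action work, params Action[] cleanupSteps)
    {
        try
        {
            work();
        }
        finally
        {
            foreach (var step in cleanupSteps)
            {
                try
                {
                    step();
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Cleanup step failed: {ex.Message}");
                }
            }
        }
    }
}
```

Here the work delegate would wrap the training, publishing and testing calls, and the cleanup steps would be () => UnPublishIteration(trainingApi, project) followed by () => DeleteProject(trainingApi, project). Whether automatic deletion is desirable depends on whether you want to inspect a failed project in the portal first.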

We have seen how to implement data uploads, tagging, training, publishing and testing for a custom vision object classification model using the Custom Vision Training and Prediction API.

In the next post, I will explore using the Custom Vision Training and Prediction API to train a custom vision object detection model.

That is all for today’s post.

I hope that you have found this post useful and informative.

Andrew Halil

Andrew Halil is a blogger, author and software developer with expertise of many areas in the information technology industry including full-stack web and native cloud based development, test driven development and Devops.
