[ImageClassification Transfer Learning] Higher level API supporting image predictions based on image file paths and in-memory images, alternatively #4084

CESARDELATORRE · 2019-08-07T20:12:17Z

This is probably more relevant to the long-term API of our 'ImageClassification Transfer Learning', once its foundational features are in place.

However, this is the point:

We aim to design and implement a high-level API that provides a very straightforward way of training (via transfer learning) image classification models and, later, other capabilities such as object detection. Equally important, scoring/predicting must be just as easy in the two normal scenarios for images:

A. Score based on a provided image file path.
B. Score based on an in-memory image.

The second choice (B) is even more common for many apps, and more so when moving to OBJECT DETECTION and LIVE VIDEO/IMAGES arriving as a stream.

CURRENT STATE IN ML.NET API

The issue is with the current API design of ML.NET, where the model scoring API depends completely on how you created the original pipeline and which schema you used, meaning:

If you trained an ML.NET model with image file paths, your ML.NET model expects an image file path for scoring.
If you trained an ML.NET model with in-memory images (not a common/straightforward approach), your ML.NET model expects an in-memory image (Bitmap) when scoring.

However, that's not the typical user scenario. The easiest way to train is with image file paths (that's how you usually have image sets, right?), but when scoring/predicting the user should be able to provide either a file path or an in-memory image (in-memory is more common for end-user apps, file paths for batch processes), and anything else should be transparent to them.

CURRENT WORKAROUND:

Sure, you can always dig into the TensorFlow .pb model that was also generated by our "ImageClassification Transfer Learning", find out the input and output tensor names, implement a different ML.NET pipeline that accepts in-memory images, create that ML.NET model by running the pipeline once (this is not really training; it just creates the transformers needed for scoring later on), and finally write more specific, not very straightforward code for scoring with in-memory images.

That process is what I wrote up in the second part of this BLOG POST and its related sample app, which predicts the class of an in-memory image that the end user provides over HTTP:

https://devblogs.microsoft.com/cesardelatorre/run-with-ml-net-c-code-a-tensorflow-model-exported-from-azure-cognitive-services-custom-vision/

But that code for scoring a TF model with in-memory images is not straightforward to implement.

INITIAL SOLUTION: Load images from files but convert them into in-memory image objects before loading them into the IDataView, so the schema matches for training and scoring

A good initial solution is to load the images from files but convert them into in-memory image objects before loading them into the IDataView. Instead of the following data class for the schema:

    public class ImageData
    {
        [LoadColumn(0)]
        public string ImagePath;

        [LoadColumn(1)]
        public string Label;
    }

We want to have something like this being used by the initial IDataView:

    public class ImageInputData
    {
        public Bitmap Image { get; set; }
        
        public string Label;
    }

That way, the ML.NET model's schema at training time would match the data class needed when scoring, which only holds an in-memory image. The user would not need to go deeper, take the TensorFlow .pb model, and create their own scoring pipeline. I explain that approach in this blog post, but it is too complex if we want users to use a high-level API:

https://devblogs.microsoft.com/cesardelatorre/run-with-ml-net-c-code-a-tensorflow-model-exported-from-azure-cognitive-services-custom-vision/
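To make the conversion step concrete, here is a minimal, dependency-free sketch of turning file-path rows into in-memory rows before an IDataView is built. Assumptions: `InMemoryImageData` and `InMemoryImageLoader` are hypothetical names, raw `byte[]` stands in for the `Bitmap` shown above so the sketch does not need System.Drawing, and the `[LoadColumn]` attributes are left out so it compiles without ML.NET. The resulting sequence could then presumably be handed to `MLContext.Data.LoadFromEnumerable`.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// File-path row, as in the issue (ML.NET attributes omitted in this sketch).
public class ImageData
{
    public string ImagePath;
    public string Label;
}

// Hypothetical in-memory row; byte[] is a stand-in for the Bitmap in the issue.
public class InMemoryImageData
{
    public byte[] Image;
    public string Label;
}

public static class InMemoryImageLoader
{
    // Reads each referenced file into memory and keeps its label.
    // Rows are produced lazily, one file at a time, as the caller enumerates.
    public static IEnumerable<InMemoryImageData> Load(
        IEnumerable<ImageData> rows, string imageFolder)
    {
        foreach (var row in rows)
        {
            yield return new InMemoryImageData
            {
                Image = File.ReadAllBytes(Path.Combine(imageFolder, row.ImagePath)),
                Label = row.Label
            };
        }
    }
}
```

With rows shaped like this, training and scoring would both see an in-memory image column, which is the schema match described above.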

POSSIBLE FUTURE HIGH LEVEL API for Image Classification, ObjectDetection and other high level SCENARIOS

The point is that we want to create high-level APIs targeting SCENARIOS (Image Classification, ObjectDetection, etc.). That means the API for the mentioned use cases (training and scoring) should also be very simple. It is not acceptable to have such complexity (see BLOG POST above) if you want to score with an in-memory image.

Somehow we should resolve the conflict between the current ML.NET pipeline API requirement (you score with the same schema you trained with) and the points I explained above.
We will probably need to create a higher-level API on top of the current pipelines API, more oriented to SCENARIOS (ImageClassification, Object Detection). The current ML.NET pipelines API probably doesn't allow what I'm describing.
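As a rough illustration only, a scenario-oriented wrapper could expose both scoring scenarios (A and B) over a single in-memory scorer. Everything here is hypothetical: `ImageClassifierFacade` is not an ML.NET type, and the `Func<byte[], string>` delegate is a placeholder for whatever the underlying model actually does to score an in-memory image.

```csharp
using System;
using System.IO;

// Hypothetical scenario-level wrapper; not part of ML.NET.
public sealed class ImageClassifierFacade
{
    // Placeholder for the real model: maps in-memory image bytes to a label.
    private readonly Func<byte[], string> _scoreBytes;

    public ImageClassifierFacade(Func<byte[], string> scoreBytes)
    {
        _scoreBytes = scoreBytes ?? throw new ArgumentNullException(nameof(scoreBytes));
    }

    // Scenario B: score an in-memory image.
    public string Predict(byte[] imageBytes)
    {
        if (imageBytes == null) throw new ArgumentNullException(nameof(imageBytes));
        return _scoreBytes(imageBytes);
    }

    // Scenario A: score an image file path by loading it into memory first,
    // so both scenarios share one schema and one scoring path.
    public string Predict(string imagePath)
    {
        if (imagePath == null) throw new ArgumentNullException(nameof(imagePath));
        return Predict(File.ReadAllBytes(imagePath));
    }
}
```

The design choice in this sketch is that the file-path overload is a thin adapter over the in-memory one, so the caller never has to care which input shape was used at training time.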

Also, the current workaround doesn't make our solution transparent to the underlying DNN architecture/framework (TensorFlow, Torch, etc.): the user has to take the "under the covers" TF .pb model and implement scoring code that is specific to TensorFlow, and that code might differ for Torch in the future.

Moving forward, this discussion will be even more important for OBJECT DETECTION, where in-memory image scoring is critical (streaming live video/images), while users might want to train based on image paths, which is simpler.

antoniovs1029 · 2020-01-09T22:32:06Z

Seems this was fixed in #4151

CESARDELATORRE added the enhancement New feature or request label Aug 7, 2019

CESARDELATORRE assigned codemzs Aug 7, 2019

ssaporito mentioned this issue Aug 20, 2019

Re-using the same Dataview with Bitmaps in memory, breaks when fitting different models or run cross validation on it #4126

Closed

ganik added the P2 Priority of the issue for triage purpose: Needs to be fixed at some point. label Aug 20, 2019

codemzs mentioned this issue Aug 28, 2019

Image classification preview 2. #4151

Merged

antoniovs1029 closed this as completed Jan 9, 2020

ghost locked as resolved and limited conversation to collaborators Mar 20, 2022
