Feature Request: Support for 16 bit integer or 32 bit float single channel images #3568

Open
j99ca opened this issue May 31, 2024 · 5 comments
Labels: FEATURE (New feature & functionality), OTX 2.0 (For OTX v2.0)

Comments

@j99ca

j99ca commented May 31, 2024

My team and I have been using OpenVINO for a couple of years now, deploying our converted TensorFlow models to the edge as IR format models. Recently I have been trying to move towards PyTorch, and specifically this library. It's been great for our RGB datasets, but we often deal with thermal and depth image data, and other scientific image formats.

With our TensorFlow-based object detection training, we are able to work around the dataset type differences, but this library seems very set on 8-bit RGB data. We have often treated our thermal data as grayscale when fine-tuning RGB-based classification or object detection models, but we keep our precision high by using 32-bit float values mapped into [0.0, 255.0]. The 8-bit limitation of this library means we can only represent 256 temperature values unless we create some palette transformation (a non-grayscale mapping), which is not ideal in our use case.
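
The precision gap described above can be illustrated with a minimal sketch. It assumes NumPy and a hypothetical temperature range of 273.15 K to 373.15 K; the frame is random data, not a real thermal image.

```python
import numpy as np

# Hypothetical range limits in kelvin; real sensors will differ.
T_MIN, T_MAX = 273.15, 373.15

# Synthetic single-channel thermal frame stored as float32.
rng = np.random.default_rng(0)
kelvin = rng.uniform(T_MIN, T_MAX, size=(64, 64)).astype(np.float32)

# Map to [0.0, 255.0] and stay in float32: sub-integer detail survives.
scaled = (kelvin - T_MIN) / (T_MAX - T_MIN) * 255.0

# Same mapping, but rounded into uint8 as an 8-bit pipeline requires.
quantized = np.clip(np.round(scaled), 0, 255).astype(np.uint8)

n_float = np.unique(scaled).size     # distinct values kept in float32
n_uint8 = np.unique(quantized).size  # can never exceed 256
print(n_float, n_uint8)
```

The uint8 copy is capped at 256 distinct levels regardless of the sensor's resolution, while the float32 copy keeps far more of the original values.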

A feature I would like to see is an easy way to expose and override image loading, so we can load our non-RGB images and transform them to [0.0, 255.0] floating-point images. Even better would be the ability to override the model architecture and add new input layers, so that the model itself can take, say, images in kelvin (single-channel 32-bit float) and apply our custom transformations in-model. We could then export the model to the IR format with the single-channel to 3-channel transformation included, which ensures portability when we deploy these models to the edge, where they will ingest images in kelvin (32-bit float, single channel).
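
One way the in-model transformation could look is sketched below in plain PyTorch. This is not an OTX API: `KelvinInputAdapter`, `t_min`, and `t_max` are hypothetical names, and the small convolution is only a stand-in for a real backbone.

```python
import torch
from torch import nn


class KelvinInputAdapter(nn.Module):
    """Hypothetical wrapper: 1-channel kelvin input -> 3-channel [0, 255] input."""

    def __init__(self, backbone: nn.Module, t_min: float, t_max: float) -> None:
        super().__init__()
        self.backbone = backbone
        self.t_min = t_min
        self.t_max = t_max

    def forward(self, kelvin: torch.Tensor) -> torch.Tensor:
        # kelvin has shape (N, 1, H, W) and dtype float32.
        scaled = (kelvin - self.t_min) / (self.t_max - self.t_min) * 255.0
        scaled = scaled.clamp(0.0, 255.0)
        # Replicate the single channel three times for an RGB-pretrained model.
        rgb_like = scaled.repeat(1, 3, 1, 1)
        return self.backbone(rgb_like)


# Stand-in backbone so the sketch runs; a real classifier or detector goes here.
backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)
model = KelvinInputAdapter(backbone, t_min=273.15, t_max=373.15).eval()

frame = torch.full((1, 1, 32, 32), 300.0)  # a uniform 300 K frame
with torch.no_grad():
    out = model(frame)
```

Because the scaling and channel replication live in `forward`, they should be captured together with the rest of the graph when the wrapped model is exported, so the deployed model would accept kelvin input directly.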

Thanks for the great work, by the way. We are currently using 1.6, but I hope to switch to 2.0 when it's ready. Speaking of 2.0, will it keep the ability to use custom/override image input sizes? That is a great feature for us.

@harimkang
Contributor

@j99ca Thanks for the suggestion!

@kprokofi @goodsong81
Aside from the feature suggestion, we haven't yet added configurable input size functionality in 2.0. It's in the backlog, of course, but we don't have an estimate of when it will be added yet. Shouldn't we be talking about when this feature is coming in?

@eunwoosh
Contributor

Hi @j99ca,
As Harim said, configurable input size won't be included in the OTX 2.0 release, but we plan to enable it later.
The exact timing isn't decided, but it will be enabled not long after the 2.0 release.

@j99ca
Author

j99ca commented Jun 10, 2024

@harimkang @eunwoosh understandable about configurable input size and timelines. How about the original request? This (and the configurable resolution) are the biggest barriers for our full adoption of this library. As I mentioned above, most of our installations are done with 16 bit single channel or floating point images.

Even just exposing the reading and decoding of images would help: we could pass a custom function to the engine or dataloader, read our scientific image data directly, and map it to the [0.0, 255.0] or [0.0, 1.0] float input that the models use (float32, not uint8). That way we would not lose detail, as we could represent millions of unique temperature values within those bounds.
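
A custom decoding function of the kind requested here might look like the following sketch. The `ImageDecoder` hook signature and `make_uint16_decoder` are hypothetical, not something the engine or dataloader accepts today, and the input is assumed to be a headerless little-endian 16-bit frame.

```python
from typing import Callable

import numpy as np

# Hypothetical hook signature: raw file bytes in, float32 HxWx3 image out.
ImageDecoder = Callable[[bytes], np.ndarray]


def make_uint16_decoder(height: int, width: int) -> ImageDecoder:
    """Build a decoder for headerless little-endian 16-bit single-channel frames."""

    def decode(raw: bytes) -> np.ndarray:
        frame = np.frombuffer(raw, dtype="<u2").reshape(height, width)
        # Scale to [0.0, 1.0] in float32 instead of truncating to uint8.
        unit = frame.astype(np.float32) / 65535.0
        # Replicate to three channels for RGB-pretrained models.
        return np.repeat(unit[..., None], 3, axis=2)

    return decode


decoder = make_uint16_decoder(2, 2)
raw = np.array([[0, 1], [32768, 65535]], dtype="<u2").tobytes()
image = decoder(raw)
```

The returned array is float32 with shape (2, 2, 3), so a dataloader that accepted such a callable could hand it to the model without an 8-bit round trip.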

@harimkang
Contributor

@wonjuleee What do you think of this feature suggestion?

@wonjuleee
Contributor

Hi all, as far as I can tell, this sounds good for handling more precise data types in OTX.
But this requires modifying both Datumaro and OTX.
Since the Datumaro image loader (lazy_image) can set the dtype for image decoding, as described in
https://github.com/openvinotoolkit/datumaro/blob/6a05715147b632442f63a0f669b7cf1c7e0c5a87/src/datumaro/util/image.py#L359, we need to slightly modify Datumaro's ImageFromFile, and this could be controlled on the OTX side through

with image_decode_context():

This would be configured through something like data.config.dtype in the recipe.
I think this might be feasible for the next Datumaro/OTX version.
@kprokofi, what do you think?
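
The context-manager approach suggested above could be sketched as follows. This is an illustration of the pattern only: `DECODE_DTYPE`, `decode_dtype_context`, and `decode` are hypothetical names, not existing Datumaro or OTX symbols.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Iterator

import numpy as np

# Hypothetical context variable holding the dtype used when decoding images.
DECODE_DTYPE: ContextVar = ContextVar("DECODE_DTYPE", default=np.dtype(np.uint8))


@contextmanager
def decode_dtype_context(dtype) -> Iterator[None]:
    """Temporarily change the dtype that decode() produces."""
    token = DECODE_DTYPE.set(np.dtype(dtype))
    try:
        yield
    finally:
        # Restore the previous dtype even if decoding raised.
        DECODE_DTYPE.reset(token)


def decode(pixels: np.ndarray) -> np.ndarray:
    """Stand-in for an image loader that honours the active dtype."""
    return pixels.astype(DECODE_DTYPE.get())


pixels = np.array([[0.25, 128.75]])
default_out = decode(pixels)             # uint8 by default, fractions are lost
with decode_dtype_context(np.float32):
    float_out = decode(pixels)           # float32 inside the context
```

A recipe-level setting such as the proposed data.config.dtype would then only need to choose which dtype is passed to the context manager.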

harimkang added the FEATURE (New feature & functionality) and OTX 2.0 (For OTX v2.0) labels on Jun 19, 2024