Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have bytes as supported value type. #844

Closed
pedropgusmao opened this issue Dec 20, 2021 · 6 comments
Closed

Have bytes as supported value type. #844

pedropgusmao opened this issue Dec 20, 2021 · 6 comments
Labels
enhancement New feature or request

Comments

@pedropgusmao
Copy link
Contributor

Is your feature request related to a problem? Please describe.
I need to instantiate an object with another one which is basically a dataclass instance like:

class Parameters:
    """Model parameters."""
    tensors: List[bytes] 
    tensor_type: str 

when trying to use this I get:
omegaconf.errors.ValidationError: Unsupported value type: <class 'bytes'>
Example:

from typing import List

from omegaconf import OmegaConf

@dataclass
class Parameters:
    """Model parameters."""
    tensors: List[bytes] 
    tensor_type: str 

# omegaconf.errors.ValidationError: Unsupported value type: <class 'bytes'>
OmegaConf.structured(Parameters)

Describe the solution you'd like
Accepting bytes.

Describe alternatives you've considered
Internally interpreting bytes as str (when parsing) then converting it to bytes.

Additional context
...

@Jasha10 Jasha10 added the enhancement New feature or request label Dec 21, 2021
@Jasha10 Jasha10 linked a pull request Dec 22, 2021 that will close this issue
@Jasha10
Copy link
Collaborator

Jasha10 commented Dec 22, 2021

Thanks for suggesting this @pedropgusmao.
Supporting bytes seems reasonable to me since bytes is a built-in type and since yaml supports the bytes type binary data natively.

@pixelb
Copy link
Collaborator

pixelb commented Jan 18, 2022

Thanks for the suggestion and implementation!

I agree this has some merit, as binary is a yaml primitive: https://yaml.org/type/binary.html
Playing devil's advocate, some thoughts against implementing this:

  • "binary" is just a base64 string, so why not do a little manual conversion in code to/from storing b64 in existing strings?
    • It would be good to have a concrete example of before/after code for this use case. BTW I vaguely remember a suggestion to support tensor as a primitive type, which had less merit, but some of the same considerations as this, as that's what's presented as the use case. Also slightly related to leaning on explicit conversion more is issue Value 'float64' is not a supported primitive type  #805 discussing numpy float handling.
  • We have to be careful to avoid becoming a data serialization mechanism. We're tuned for smaller amounts of config, rather than larger amounts of data

If we do support bytes, I'm not sure about the support for converting to/from strings. That would have to be restricted to utf-8, so I don't see the use case for this conversion support

@Jasha10
Copy link
Collaborator

Jasha10 commented Jan 20, 2022

I've agree about the conversion support -- it's cleaner in my opinion if assigning str to a bytes field fails and assigning bytes to a str field fails.
cc @jieru-hu

@Jasha10
Copy link
Collaborator

Jasha10 commented Jan 20, 2022

* "binary" is just a base64 string, so why not do a little manual conversion in code to/from storing b64 in existing strings?
  
  * It would be good to have a concrete example of before/after code for this use case. 

@pedropgusmao, would you mind giving a motivating example?

@pedropgusmao
Copy link
Contributor Author

pedropgusmao commented Jan 20, 2022

Hello @Jasha10 and @pixelb ,
Thank you both for the comments.

Tbh, my use case is not really related to conversion the str <-> bytes , nor is it directly related to loading bytes from .yaml files.
Instead, I'm looking for simple/clean way to set all my regular parameters (str, float, etc...) and to instantiate different implementations of an given interface inside a single .yaml config file.

As you must definitely know, both things can be achieved using Hydra, where calling hydra.utils.instantiate(cfg.obj,named_arg=parameters) would solve for the instantiation part of the problem. However, even if parameters is created inside my python script (i.e. not inside the '.yaml' file), I get the error seen at the beginning of this Issue. Allowing bytes to be supported in OmegaConf would allow Hydra to accept bytes-containing data classes as input to its hydra.utils.instantiate.

Forgot to mention: This will also help generations to come... eventually.

@Jasha10
Copy link
Collaborator

Jasha10 commented Mar 10, 2022

Support for bytes is now implemented on OmegaConf's master branch and can be tested by installing the OmegaConf 2.2.0.dev1 development release from pypi.

@Jasha10 Jasha10 closed this as completed Mar 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants