Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define formal primitives specification #42

Open
micahjsmith opened this issue Nov 9, 2018 · 6 comments
Open

Define formal primitives specification #42

micahjsmith opened this issue Nov 9, 2018 · 6 comments
Labels
internal improvement This is an improvement that does not add new functionality Pending Review The bug is not confirmed or the feature request is being considered

Comments

@micahjsmith
Copy link

The JSON primitives files can be fairly large and complicated and susceptible to misspecification. Implement a formal specification for the JSON files using json schema

  • define the specification in JSON schema (I suggest using the jsonschema package from PyPI)

  • implement a validator to validate a JSON primitive file:

      from mlprimitives.validation import validate
      primitive = load_primitive_from_json()  # or similar
      primitive = validate(primitive)
    
@csala
Copy link
Contributor

csala commented Nov 12, 2018

I agree on using a json schema for this.
However, this should not be here but rather on MLBlocks.
Then, optionally, if we consider that MLPrimitives needs to add anything to it, it should extend the one from MLBlocks to add its own things.

Note that the primitive specification is mostly explained already in the MLBlocks documentation, even though it is not defined in a truly formal way.

Regarding the validation function, I totally agree on creating it, but also on the MLBlocks side.
Until this is done, the approach that we followed for a basic validation was creating an MLPipeline instance that uses the primitive, but this clearly is not good enough.

@micahjsmith
Copy link
Author

Suppose someone were to make a pull request to MLPrimitives to add a new primitive. Would it make sense that MLPrimitives needs to rely on an external library to decide if the primitive is valid? I think it makes sense to define the specification for primitives in MLPrimitives.

Certainly the existing documentation and validation works for our purposes so far but a jsonschema would formalize this and make our lives easier.

@micahjsmith
Copy link
Author

(Correction: I meant the jsonmodels package on PyPI: https://jsonmodels.readthedocs.io/en/latest/)

@csala
Copy link
Contributor

csala commented Nov 13, 2018

Suppose someone were to make a pull request to MLPrimitives to add a new primitive. Would it make sense that MLPrimitives needs to rely on an external library to decide if the primitive is valid?

Yes, absolutely. It would make sense. Basically because MLPrimitives is a collection of primitives specifically crafted to be compliant with this third party library.
And actually, you can ask the question in the opposite direction: Would it make sense for someone to be able to contribute a primitive to MLPrimitives that is not compliant with the MLBlocks format? Since the answer is no, making this JSON specification independent form MLBlocks makes no sense.

So yes, without any doubt, the specification has to be in MLBlocks, and MLPrimitives has to follow it. Or, at most, if additional things that belong only to MLPrimitives exist, extend it to add them.

@csala csala added the internal improvement This is an improvement that does not add new functionality label Feb 13, 2019
@csala csala added the Pending Review The bug is not confirmed or the feature request is being considered label Feb 25, 2019
@micahjsmith
Copy link
Author

@csala Bumping this. Has our thinking changed on the presence of a formal specification? I still suggest adding a json schema specification for ml primitives within the mlprimitives repo. Even the specification is sufficient; a validator is not strictly needed, though would be useful.

@csala
Copy link
Contributor

csala commented Jun 11, 2019

@micahjsmith this is pending to be discussed in the future, after some other changes get in first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
internal improvement This is an improvement that does not add new functionality Pending Review The bug is not confirmed or the feature request is being considered
Projects
None yet
Development

No branches or pull requests

2 participants