Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#5600] feat(API): Add Java API definition for ML model in Gravitino #5612

Merged
merged 5 commits into from
Nov 25, 2024

Conversation

jerryshao
Copy link
Contributor

What changes were proposed in this pull request?

This PR adds the API definition for ML model and model catalog.

Why are the changes needed?

This is the first step to support ML model management in Gravitino.

Fix: #5600

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

Manual verification.

@jerryshao jerryshao self-assigned this Nov 19, 2024
* @throws NoSuchModelException If the model does not exist.
* @throws ModelVersionAliasesAlreadyExistsException If the aliases already exist in the model.
*/
ModelVersion link(String[] aliases, String comment, String url, Map<String, String> properties)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have the interface unlink to change the version alias?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the change/alter/update support, I will do this in the next version. In this version, we only support register/link, get, list and delete operations, so I didn't add that interface.

*
* @return the properties of the model version.
*/
Map<String, String> properties();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add the author and date for ModelVersion if this is a version management. I just want to refer to some design about git.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ModelVersion has audit info, so it will have the creator and date information for each version. We don't need to have a new property to specify this.

@jerryshao
Copy link
Contributor Author

@xloya @coolderli @zhoukangcn can you please also review this API design, to see if it is OK for you.

@coolderli
Copy link
Collaborator

LGTM.

@xloya
Copy link
Collaborator

xloya commented Nov 21, 2024

Sorry late for reply, I will take time to review it today.

@jerryshao
Copy link
Contributor Author

@xunliu @shaofengshi @FANNG1 @yuqi1129 @mchades @diqiu50 can you please also help to review?

*
* @return The versions of the model object.
*/
ModelVersion[] versions();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need return all versions here? If there are too many versions, I think this will lead to a significant decrease in query efficiency.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of returning a version list, and then using the version number to get the specific model version, but I changed to this because I assumed that there may not be too many versions in one model. Do you know about the real scenario, will there be hundreds of versions in one model?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are many types of models within Xiaomi. Checkpoints generated during training in common frameworks such as Tensorflow/Pytorch will also be registered as models. In the long run, there may be a large amount of version information.

Copy link
Contributor Author

@jerryshao jerryshao Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So how about returning a version list, then user can use the returned version list to get the specified model version detail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xloya is it neccesary to record every checkpoint when training a model or just record the best model version?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jerryshao I think we could only return the latest version when getting the model metadata, and provide a api to getting all versions by the model, WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FANNG1 I don’t know much about this, but based on our current internal business, the model is updated almost every day.

* @return The model version object.
* @throws NoSuchModelVersionException if the version does not exist.
*/
ModelVersion version(int version) throws NoSuchModelVersionException;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it more scalable to make Model Version related operations independent of interfaces?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot get your meaning, can you explain more about the meaning of "scalable"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think that the frequency of model version metadata operations may be more frequent than some metadata of the model itself, so perhaps the model version operations can be made into a separate interface to avoid frequent modifications to the model interface implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about also moving these interfaces to ModelCatalog?

In my design doc, I put these interfaces in ModelCatalog, so users can directly operate the ModelVersion, no need to get the Model first, and then get the ModelVersion. But I changed to this to align with table/partition design and concept.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think these methods are probably more suitable for the definition of metadata operations rather than metadata definitions.

@jerryshao
Copy link
Contributor Author

@xloya @FANNG1 can you please take another review?

@xloya
Copy link
Collaborator

xloya commented Nov 25, 2024

LGTM

@FANNG1 FANNG1 merged commit dd87cdd into apache:main Nov 25, 2024
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask]Design and implement the Java API for model and model operation
6 participants