Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include the "save file" action (.ZIP file) as part of model.SaveFile() #1689

Closed
CESARDELATORRE opened this issue Nov 21, 2018 · 6 comments · Fixed by #2993
Closed

Include the "save file" action (.ZIP file) as part of model.SaveFile() #1689

CESARDELATORRE opened this issue Nov 21, 2018 · 6 comments · Fixed by #2993
Assignees

Comments

@CESARDELATORRE
Copy link
Contributor

Currently, whenever we are saving a .ZIP model file, you always need to handle the code for the File stream class.

I think that is repetitive code that could be included in the API.
Instead of:

using (var fs = new FileStream(ModelPath, FileMode.Create, FileAccess.Write, FileShare.Write))
                mlContext.Model.Save(trainedModel, fs);

We could have an overridden method so I could simply do the following:

mlContext.Model.Save(trainedModel, modelPath);

Or even the following, if those methods were part of the model class itself instead of a utility class in the MLContext:

model.Save(modelPath);

Currently, this kind of line using the file stream class is something you need to repeat over and over in every/most training app (even when the constructor can be simplified):

using (var fs = new FileStream(ModelPath, FileMode.Create, FileAccess.Write, FileShare.Write))

Handling a FileStream object might should not be mandatory for the user but optional:

Of course, I would also maintain the current methods because in some cases you might want to just provide an existing stream object, but as opt-in.

@Zruty0
Copy link
Contributor

Zruty0 commented Nov 21, 2018

We already have model.SaveTo to save the model.
@CESARDELATORRE , we wanted to follow up and see how the other .NET libraries go about IO: do they provide a pair of functionality for their IO, or only one?
By pair I mean one method to IO from a stream, and another to IO from a string filePath.

@Zruty0
Copy link
Contributor

Zruty0 commented Nov 21, 2018

Also, there are shorter ways to create a file stream:

using (var fs = File.Create(ModelPath))

@CESARDELATORRE
Copy link
Contributor Author

CESARDELATORRE commented Nov 22, 2018

Important to note is that, as I mentioned, it should also have the possibility of using an existing file stream.
But in most of the cases you just want the simplest way which is to save the model's .ZIP file in a path, in a single line:

model.Save(mlContext, filePath);

For instance, in the case of the XMLDocument, it provides 4 overloads, by using stream, just the filepath, TextWriter and XMLWriter:

https://docs.microsoft.com/en-us/dotnet/api/system.xml.xmldocument.save?view=netframework-4.7.2

A few examples of .NET libraries where you just need to provide the file path name without having to deal with any stream:

XML Document

XmlDocument doc = new XmlDocument();
doc.LoadXml("<item><name>wrench</name></item>");
// change XML, etc...
doc.Save("MyFile.xml");

WORD Automation in C#:

// Create new document.
var doc = new DocumentModel();
// Working with content...
// Save as PDF file.
doc.Save("Sample.pdf");

VS Document.Save(path) Method

https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/c0hhd7xt(v%3dvs.100)

VS Graph Save(path) method

https://docs.microsoft.com/en-us/previous-versions/visualstudio/visual-studio-2010/ff655497(v%3dvs.100)

In Python / Scikit-Learn you also use a single API step: dump

dump(model, 'filePathName')

In contrast, we currently need to handle the FileStream object in between:

This is what we have now:

using (var fs = File.Create(ModelPath))
         trainedModel.SaveTo(mlContext, fs);

or this

using (var fs = new FileStream(ModelPath))
            mlContext.Model.Save(trainedModel, fs);

But the simplest step would be to just provide the filePath:

trainedModel.SaveTo(mlContext, filePath);

Btw, I guess we always need to provide the mlContext in order to save the model sa a file, right?

In any case, this is a minor improvement/simplication, not a big deal, but since the implementation is simple, that's why I'm suggesting it. :)

@JRAlexander
Copy link

Can this also be made Async?

@glebuk
Copy link
Contributor

glebuk commented Jan 14, 2019

Please ensure that for model saving and loading there exist both stream and string based overloads.

@eerhardt
Copy link
Member

Related #2983

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants