Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate ONNX implementations in separate DLL #462

Merged
merged 9 commits into from
Jul 5, 2018

Conversation

TomFinley
Copy link
Contributor

@TomFinley TomFinley commented Jun 30, 2018

Fixes #162.

Previously, the ONNX infrastructure and implementations (including refs to protobuf) were in a central DLL. This gave us a dependency on a separate somewhat large project (protobuf), that was only of interest to people saving ONNX models.

By having the components save themselves through abstract classes rather than actual instantiable classes (OnnxContext became an abstract class with a hidden OnnxContextImpl implementation, NodeProto and OnnxUtils became the OnnxNode abstract class with, likewise, a hidden implementation), there is no need for any "direct" dependency on protobuf.

All implementation classes become internal classes of the Microsoft.ML.Onnx project. (This was previously called Microsoft.ML.UniversalFormat due to historical reasons that no longer make sense.) The only public classes in that project are the entry-points and commands inside SaveOnnxCommand.cs, which instantiate actual implementors of those interfaces, then pass to ONNX savable components.

Also, I opportunistically improved documentation on those public interfaces (though even with docs the interfaces would scarcely make sense to someone unfamiliar with ONNX), and improved the code.

@@ -87,7 +86,7 @@ private static VersionInfo GetVersionInfo()
/// Returns the underlying data view of the composite loader.
/// This can be used to programmatically explore the chain of transforms that's inside the composite loader.
/// </summary>
internal IDataView View { get { return _view; } }
public IDataView View { get; }
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public [](start = 8, length = 6)

Not sure about this one..

@TomFinley
Copy link
Contributor Author

TomFinley commented Jun 30, 2018

/// This data model component is savable as Onnx.

Should be ONNX. #Closed


Refers to: src/Microsoft.ML.Data/Model/Onnx/ICanSaveOnnx.cs:22 in 23f62cf. [](commit_id = 23f62cf, deletion_comment = False)

@TomFinley
Copy link
Contributor Author

TomFinley commented Jun 30, 2018

Build OSX10.13 Release #Closed

@TomFinley
Copy link
Contributor Author

@dotnet-bot test OSX10.13 Release

{
void AddAttribute(string argName, double value);
void AddAttribute(string argName, IEnumerable<double> value);
void AddAttribute(string argName, IEnumerable<float> value);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IEnumerable value); [](start = 42, length = 26)

IEnumerable value); [](start = 42, length = 26)

so everyone has a pair of T value and IEnumerable method, but float has only IEnumerable
I don't know is it wrong or not, but it just stands out. #Resolved

Copy link
Contributor Author

@TomFinley TomFinley Jul 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think it's because double param will work for both float and double, but IEnumerable<double> will only work for IEnumerable<double>, not also IEnumerable<float>. #Resolved

void AddAttribute(string argName, IEnumerable<DvText> value);
void AddAttribute(string argName, IEnumerable<string> value);
void AddAttribute(string argName, string value);
void AddAttribute(string argName, bool value);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My "OCD" wants them to be formatted in either all T value methods first, and then all IEnumerable values methods, or as couples of T value method + IEnumerable value.
#Resolved

@eerhardt
Copy link
Member

eerhardt commented Jul 2, 2018

pushing the DLL into its own nuget so we could remove the protobuf dependency. However I am not sure how to do that. Maybe @eerhardt can help me here.

Sure, it is actually pretty simple once you know how. You can duplicate/emulate #392, which adds the LightGBM nupkg.

  1. Add <packageName>.nupkgproj and <packageName>.symbols.nupkgproj files under a pkg\<packageName> folder.
  2. Set the TFM, PackageDescription properties and references in <packageName.nupkgproj. (Be sure to ProjectReference="Microsoft.ML.nupkgproj").
  3. In your Microsoft.ML.Onnx.csproj, set <IncludeInPackage>Microsoft.ML.Onnx</IncludeInPackage> to put the output of that .csproj in the appropriate NuGet package.

///
///
/// </summary>
public interface IOnnxContext
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to make an abstract class instead? That way we can add more members to this type in the future without having to make an IOnnxContext2 type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, maybe. Though in this case, given that the point is that I want none of its implementation to be in here, I'm not certain that I see the benefit of a base class. Is there some article I could read that expresses this principle?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what I'd like to do is lock down this interface/base-class. Is there some way I can do that? Make a base class that no one can easily descend from, except some sort of "approved" implementing assembly? (In this case the onnx project, and also possibly some sort of test assembly.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://msdn.microsoft.com/en-us/library/ms229013(v=vs.100).aspx

Do favor defining classes over interfaces.
In later versions of your library, you can safely add new members to classes; you cannot add members to interfaces without breaking existing code.
Do use abstract (MustInherit in Visual Basic) classes instead of interfaces to decouple the contract from implementations.

Also, this isn't official msdn, but it appears to be copied from the book

https://www.developer.com/design/article.php/10925_3573356_3/Type-Design-Guidelines-for-Reusable-NET-Libraries.htm.

Is there some way I can do that? Make a base class that no one can easily descend from, except some sort of "approved" implementing assembly?

You can't do it with an interface, as anyone can implement the interface. But you can do it with an abstract class by making the constructor internal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I get the point about interfaces vs. base classes. If the class would only have abstract methods (as these would), I'm not sure I see how the problem is any better with base classes. And to the extent that you can "solve" it by introducing implementations of methods you introduce the fragile base class problem.

internal from what I understand works only for restricting access to the same assembly. This certainly doesn't cover this case, since the entire point is that the implementation should be in a different assembly. Anything based on signed DLLs, perhaps?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the class would only have abstract methods (as these would), I'm not sure I see how the problem is any better with base classes.

The point is, you can add members (properties, non-abstract methods) to a class without being a breaking change. New methods can be virtual, and your implementation class override them. And the abstract class can no-op, throw, return a default value, etc.
Once you ship an interface, you can never add new members to it.

Now in this case, if you restrict the class so that ONLY you can inherit from it (like using an internal constructor), then you can also add abstract methods to the base class in future versions. Because you know that no one else can inherit from it.

internal from what I understand works only for restricting access to the same assembly. This certainly doesn't cover this case, since the entire point is that the implementation should be in a different assembly.

You can use InternalsVisibleTo to get around this, if you want.

/// That method creates a with inputs and outputs, but this object can modify the node further
/// by adding attributes (in ONNX parlance, attributes are more or less constant parameterizations).
/// </summary>
public interface IOnnxNode
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar question as IOnnxContext. Would it be better if this was an abstract class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe?

@TomFinley
Copy link
Contributor Author

All right, I think I did it correctly, let me know. But I'm not certain. How can I test it out? So if I just run build.cmd everything builds successfully, but then again if I run it after introducing deliberate errors into the nupkgproj or the project file to include in some non-existent package it also builds without complaint.

I guess I've never tried to build the nugets directly from here, not sure if there's some other script other than build.cmd I need to do that.


In reply to: 401915564 [](ancestors = 401915564)

@eerhardt
Copy link
Member

eerhardt commented Jul 2, 2018

I guess I've never tried to build the nugets directly from here, not sure if there's some other script other than build.cmd I need to do that.

See https://github.com/dotnet/machinelearning/blob/master/docs/building/windows-instructions.md#building-from-the-command-line

  • build.cmd - builds the assemblies
  • build.cmd -runTests - builds the assemblies and runs tests
  • build.cmd -buildPackages called after a normal “build.cmd” will create the NuGet packages with the assemblies in “bin"

@TomFinley
Copy link
Contributor Author

Ah nice. I feel like I should have known that. Yes seems to work. Thanks Eric!


In reply to: 401961587 [](ancestors = 401961587)

@TomFinley
Copy link
Contributor Author

@dotnet-bot test OSX10.13 Release

@TomFinley TomFinley force-pushed the tfinley/OnnxIsolation branch 2 times, most recently from f25025a to 4851898 Compare July 3, 2018 15:02

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<PackageDescription>ML.NET component for Exporting ONNX Models</PackageDescription>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we envision using this package/library for more than just exporting an ONNX model? For example, if I want to load/execute an ONNX model, would that be through a separate library?

(nit) does Exporting need to be capitalized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we envision using this package/library for more than just exporting an ONNX model? For example, if I want to load/execute an ONNX model, would that be through a separate library?

I'm not certain. If I had to guess, this hypothetical execution would involve linking to some native library in some fashion (Lotus?). I don't see much opportunity for code reuse in that case of course, so that suggests a different library.

does Exporting need to be capitalized?

No.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. We can adjust the package description later, if we add other functionality.

/// <param name="name">The name of the operator, which ought to be something returned from <see cref="OnnxContext.GetNodeName(string)"/></param>
/// <param name="domain">The domain of the ONNX operator, if non-default</param>
/// <returns>A node added to the in-progress ONNX graph, that attributes can be set on</returns>
public static OnnxNode CreateNode(this OnnxContext ctx,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this to be an extension method anymore? Can't it just be a regular method on OnnxContext?

TomFinley and others added 2 commits July 3, 2018 09:27
* Improve ML.NET version handling to actually rely on the assembly vs. a constant string we have to change.
* Tighten checks of arguments to the context implementation.
/// given to a component, all other components up to that component have already attempted to express themselves in
/// this context, with their outputs possibly available in the ONNX graph.
///
///
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you probably don't need this lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. But you never know.


In reply to: 199963584 [](ancestors = 199963584)

}
}

string columnName = _columnNameMap.Single(kvp => string.Compare(kvp.Value, variableName) == 0).Key;
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Jul 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string.Compare(kvp.Value, variableName) == 0 [](start = 61, length = 44)

curious, why you prefer string.Compare to just '==' #Resolved

Copy link
Contributor Author

@TomFinley TomFinley Jul 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't, not quite sure what happened here. I'm fairly certain I wouldn't have written that...


In reply to: 199965797 [](ancestors = 199965797)

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@TomFinley
Copy link
Contributor Author

@dotnet-bot test Linux Debug

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@TomFinley TomFinley merged commit 52cc874 into dotnet:master Jul 5, 2018
@TomFinley TomFinley deleted the tfinley/OnnxIsolation branch July 5, 2018 20:07
eerhardt pushed a commit to eerhardt/machinelearning that referenced this pull request Jul 27, 2018
Abstraction of ONNX exporting to interfaces, and isolation of actual implementation to separate DLL. Creation of a new NuGet to isolate Protobuf dependency.
@ghost ghost locked as resolved and limited conversation to collaborators Mar 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants