-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Export to ONNX and cross-platform command-line tool to script ML.NET training and inference #248
Changes from 16 commits
1b78603
1508cd4
c6763f3
b04ef49
eaa8e3a
de02e1e
6f4434e
6c97416
534fcd1
caebed0
692b526
6c92033
e466157
5a04795
8ef9b3f
64cdb80
3276dd3
c6bc1c6
17a738a
5090d24
4cfed38
0a13ad7
5bea824
2ab729f
faf528c
3dfd81f
1865825
3562331
d20f1a4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
// Licensed to the .NET Foundation under one or more agreements. | ||
// The .NET Foundation licenses this file to you under the MIT license. | ||
// See the LICENSE file in the project root for more information. | ||
|
||
namespace Microsoft.ML.Runtime.Tools.Console | ||
{ | ||
public static class Console | ||
{ | ||
public static int Main(string[] args) | ||
{ | ||
string all = string.Join(" ", args); | ||
return Maml.MainAll(all); | ||
} | ||
|
||
public static unsafe int MainRaw(char* psz) | ||
{ | ||
string args = new string(psz); | ||
return Maml.MainAll(args); | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
<Project Sdk="Microsoft.NET.Sdk"> | ||
|
||
<PropertyGroup> | ||
<AllowUnsafeBlocks>true</AllowUnsafeBlocks> | ||
<DefineConstants>CORECLR</DefineConstants> | ||
<IncludeInPackage>Microsoft.ML</IncludeInPackage> | ||
<TargetFramework>netcoreapp2.0</TargetFramework> | ||
<OutputType>Exe</OutputType> | ||
<StartupObject>Microsoft.ML.Runtime.Tools.Console.Console</StartupObject> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
This would be the actual thing we'd do There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Startup object contains the static entrypoint function for the standalone executable. You would do on Microsoft.ML.Runtime.Tools.Console.dll, are you saying we should generate mml.dll instead? In reply to: 192227732 [](ancestors = 192227732) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, in things intended to be invoked from the command line I'd prefer something shorter. We're coming up with a replacement for In reply to: 192240765 [](ancestors = 192240765,192227732) |
||
</PropertyGroup> | ||
|
||
<ItemGroup> | ||
<ProjectReference Include="..\Microsoft.ML.Core\Microsoft.ML.Core.csproj" /> | ||
<ProjectReference Include="..\Microsoft.ML.Maml\Microsoft.ML.Maml.csproj" /> | ||
</ItemGroup> | ||
|
||
</Project> |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,13 +9,17 @@ | |
using Microsoft.ML.Runtime.Command; | ||
using Microsoft.ML.Runtime.CommandLine; | ||
using Microsoft.ML.Runtime.Data; | ||
using Microsoft.ML.Runtime.EntryPoints; | ||
using Microsoft.ML.Runtime.Internal.Utilities; | ||
using Microsoft.ML.Runtime.Model.Onnx; | ||
using Microsoft.ML.Runtime.UniversalModelFormat.Onnx; | ||
using Newtonsoft.Json; | ||
|
||
[assembly: LoadableClass(SaveOnnxCommand.Summary, typeof(SaveOnnxCommand), typeof(SaveOnnxCommand.Arguments), typeof(SignatureCommand), | ||
"Save ONNX", "SaveOnnx", DocName = "command/SaveOnnx.md")] | ||
|
||
[assembly: LoadableClass(typeof(void), typeof(SaveOnnxCommand), null, typeof(SignatureEntryPointModule), "SaveOnnx")] | ||
|
||
namespace Microsoft.ML.Runtime.Model.Onnx | ||
{ | ||
public sealed class SaveOnnxCommand : DataCommand.ImplBase<SaveOnnxCommand.Arguments> | ||
|
@@ -37,14 +41,24 @@ public sealed class Arguments : DataCommand.ArgumentsBase | |
[Argument(ArgumentType.AtMostOnce, HelpText = "The 'domain' property in the output ONNX.", NullName = "<Auto>", SortOrder = 4)] | ||
public string Domain; | ||
|
||
[Argument(ArgumentType.AtMostOnce, HelpText = "Comma delimited list of input column names to drop", ShortName = "idrop", SortOrder = 5)] | ||
[Argument(ArgumentType.AtMostOnce, Visibility = ArgumentAttribute.VisibilityType.CmdLineOnly, HelpText = "Comma delimited list of input column names to drop", ShortName = "idrop", SortOrder = 5)] | ||
public string InputsToDrop; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Set visibility on this. #Closed |
||
|
||
[Argument(ArgumentType.AtMostOnce, HelpText = "Comma delimited list of output column names to drop", ShortName = "odrop", SortOrder = 6)] | ||
[Argument(ArgumentType.AtMostOnce, Visibility = ArgumentAttribute.VisibilityType.EntryPointsOnly, HelpText = "Array of input column names to drop", Name = nameof(InputsToDrop), SortOrder = 6)] | ||
public string[] InputsToDropArray; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Awkward name to use. Please use |
||
|
||
[Argument(ArgumentType.AtMostOnce, Visibility = ArgumentAttribute.VisibilityType.CmdLineOnly, HelpText = "Comma delimited list of output column names to drop", ShortName = "odrop", SortOrder = 7)] | ||
public string OutputsToDrop; | ||
|
||
[Argument(ArgumentType.AtMostOnce, HelpText = "Whether we should attempt to load the predictor and attach the scorer to the pipeline if one is present.", ShortName = "pred", SortOrder = 7)] | ||
[Argument(ArgumentType.AtMostOnce, Visibility = ArgumentAttribute.VisibilityType.EntryPointsOnly, HelpText = "Array of output column names to drop", Name = nameof(OutputsToDrop), SortOrder = 8)] | ||
public string[] OutputsToDropArray; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Same comments on inputs apply to outputs. #Closed |
||
|
||
[Argument(ArgumentType.AtMostOnce, Visibility = ArgumentAttribute.VisibilityType.CmdLineOnly, HelpText = "Whether we should attempt to load the predictor and attach the scorer to the pipeline if one is present.", ShortName = "pred", SortOrder = 9)] | ||
public bool? LoadPredictor; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is there a reason to have this in entry-point land? #Closed There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not if we are passing in ITransformModel that contains the predictor. In reply to: 191819078 [](ancestors = 191819078) |
||
|
||
[Argument(ArgumentType.Required, Visibility = ArgumentAttribute.VisibilityType.EntryPointsOnly, HelpText = "Model that needs to be converted to ONNX format.", SortOrder = 10)] | ||
|
||
public ITransformModel Model; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There appears to be a newline gap here. We tend to not put whitespace between the attribute and the field, as we see elsewhere here. #Closed |
||
} | ||
|
||
private readonly string _outputModelPath; | ||
|
@@ -54,6 +68,7 @@ public sealed class Arguments : DataCommand.ArgumentsBase | |
private readonly bool? _loadPredictor; | ||
private readonly HashSet<string> _inputsToDrop; | ||
private readonly HashSet<string> _outputsToDrop; | ||
private readonly ITransformModel _model; | ||
|
||
public SaveOnnxCommand(IHostEnvironment env, Arguments args) | ||
: base(env, args, LoadName) | ||
|
@@ -68,16 +83,18 @@ public SaveOnnxCommand(IHostEnvironment env, Arguments args) | |
_name = args.Name; | ||
|
||
_loadPredictor = args.LoadPredictor; | ||
_inputsToDrop = CreateDropMap(args.InputsToDrop); | ||
_outputsToDrop = CreateDropMap(args.OutputsToDrop); | ||
_inputsToDrop = CreateDropMap(args.InputsToDropArray ?? args.InputsToDrop?.Split(',')); | ||
_outputsToDrop = CreateDropMap(args.OutputsToDropArray ?? args.OutputsToDrop?.Split(',')); | ||
_domain = args.Domain; | ||
_model = args.Model; | ||
} | ||
|
||
private static HashSet<string> CreateDropMap(string toDrop) | ||
private static HashSet<string> CreateDropMap(string[] toDrop) | ||
{ | ||
if (string.IsNullOrWhiteSpace(toDrop)) | ||
if (toDrop == null) | ||
return new HashSet<string>(); | ||
return new HashSet<string>(toDrop.Split(',')); | ||
|
||
return new HashSet<string>(toDrop); | ||
} | ||
|
||
public override void Run() | ||
|
@@ -115,26 +132,34 @@ private void GetPipe(IChannel ch, IDataView end, out IDataView source, out IData | |
|
||
private void Run(IChannel ch) | ||
{ | ||
IDataLoader loader; | ||
IPredictor rawPred; | ||
RoleMappedSchema trainSchema; | ||
IDataLoader loader = null; | ||
IPredictor rawPred = null; | ||
IDataView view; | ||
RoleMappedSchema trainSchema = null; | ||
|
||
if (string.IsNullOrEmpty(Args.InputModelFile)) | ||
if (_model == null) | ||
{ | ||
loader = CreateLoader(); | ||
rawPred = null; | ||
trainSchema = null; | ||
Host.CheckUserArg(Args.LoadPredictor != true, nameof(Args.LoadPredictor), | ||
"Cannot be set to true unless " + nameof(Args.InputModelFile) + " is also specifified."); | ||
if (string.IsNullOrEmpty(Args.InputModelFile)) | ||
{ | ||
loader = CreateLoader(); | ||
rawPred = null; | ||
trainSchema = null; | ||
Host.CheckUserArg(Args.LoadPredictor != true, nameof(Args.LoadPredictor), | ||
"Cannot be set to true unless " + nameof(Args.InputModelFile) + " is also specifified."); | ||
} | ||
else | ||
LoadModelObjects(ch, _loadPredictor, out rawPred, true, out trainSchema, out loader); | ||
|
||
view = loader; | ||
} | ||
else | ||
LoadModelObjects(ch, _loadPredictor, out rawPred, true, out trainSchema, out loader); | ||
view = _model.Apply(Host, new EmptyDataView(Host, _model.InputSchema)); | ||
|
||
// Get the transform chain. | ||
IDataView source; | ||
IDataView end; | ||
LinkedList<ITransformCanSaveOnnx> transforms; | ||
GetPipe(ch, loader, out source, out end, out transforms); | ||
GetPipe(ch, view, out source, out end, out transforms); | ||
Host.Assert(transforms.Count == 0 || transforms.Last.Value == end); | ||
|
||
var ctx = new OnnxContext(Host, _name, _domain); | ||
|
@@ -228,10 +253,24 @@ private void Run(IChannel ch) | |
|
||
if (!string.IsNullOrWhiteSpace(Args.OutputModelFile)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Did you set the visibility on these? I don't see the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are not exposed in the EP since I created the arguments class for EP input. In reply to: 191821856 [](ancestors = 191821856) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure, but I got the sense you didn't want to do that. It means a fair amount of duplication that I think is probably unnecessary. In reply to: 191901052 [](ancestors = 191901052,191821856) |
||
{ | ||
Contracts.Assert(loader != null); | ||
|
||
ch.Trace("Saving the data pipe"); | ||
// Should probably include "end"? | ||
SaveLoader(loader, Args.OutputModelFile); | ||
} | ||
} | ||
|
||
public sealed class Output | ||
{ | ||
} | ||
|
||
[TlcModule.EntryPoint(Name = "Models.OnnxConverter", Desc = "Converts the model to ONNX format.", UserName = "ONNX Converter.")] | ||
public static Output Apply(IHostEnvironment env, Arguments input) | ||
{ | ||
new SaveOnnxCommand(env, input).Run(); | ||
return new Output(); | ||
} | ||
|
||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is
MainRaw
onchar*
a thing? I've not quite seen this pattern before. What is this for? #ClosedThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is present in MAML.cs at line 38. I'm not sure why it was written but I assume this is probably an entry point for some scenario. Should we remove it?
In reply to: 192227331 [](ancestors = 192227331)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, these came from MAML.cs. I see that now.
The sole purpose of this DLL is to enable invocations of
dotnet
to work. That's the only reason it exists. So drop thechar*
Also: at the risk of stating the obvious, let's not copy functions of assemblies we are referencing, let's call them.
The only thing I'd like to see in this class is
public static int Main(string[] args) => Maml.Main(args);
, and that is it.In reply to: 192462330 [](ancestors = 192462330,192227331)