Smoke test for F# #600

Closed
wants to merge 5 commits into from

Conversation

dsyme
Contributor

@dsyme dsyme commented Jul 30, 2018

This addresses the first part of #540, i.e. adding an initial smoke test for ML.NET and F#

Submitting now to check that CI passes ok, will then iterate.

@dsyme
Contributor Author

dsyme commented Jul 30, 2018

I got this error initially on OSX

FSC : error FS2014: A problem occurred writing the binary '/Users/dotnet-bot/j/w/dotnet_machinelearning/master/osx10.13_debug_prtest/bin/obj/AnyCPU.Debug/Microsoft.ML.FSharp.Tests/netcoreapp2.0/Microsoft.ML.FSharp.Tests.dll': A call to StrongNameSignatureSize failed (Invalid Public Key blob) 

This is due to an F# compiler issue when using /publicsign+ (I think the problem occurs when using a test key that contains only a public key, or something like that). I'll file an F# compiler bug.

In the meantime we work around it by not using /publicsign for the F# test assembly (since it's only a test assembly, it doesn't matter either way).

@dsyme dsyme changed the title from "[WIP] smoke test for F#" to "Smoke test for F#" Jul 30, 2018
@dsyme
Contributor Author

dsyme commented Jul 30, 2018

This is ready for review etc.

Felt like a lot of work for +196 lines. But still, there it is

@dsyme dsyme added the F# Support of F# language label Jul 30, 2018
Member

@eerhardt eerhardt left a comment

Thanks for the work, @dsyme. It's good that we are getting F# test coverage in our repo.

In order to get your new test running, you'll need to add a line here:

<Project Include="$(MSBuildThisFileDirectory)**\*.csproj" />

<PropertyGroup>
<TargetFrameworks>netcoreapp2.0</TargetFrameworks>
<NoWarn>2003;$(NoWarn)</NoWarn>
<TargetFrameworks Condition="'$(OS)' != 'Unix'">$(TargetFrameworks); net461</TargetFrameworks>
Member

(nit) I'd keep this line next to the other TargetFrameworks declaration.

Contributor Author

thanks!

</ItemGroup>

<ItemGroup>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="15.7.0" />
Member

These 3 PackageReferences are not needed, since all projects under the test directory automatically include them.

<PackageReference Include="Microsoft.NET.Test.Sdk" Version="15.5.0" />
<PackageReference Include="xunit" Version="2.3.1" />
<PackageReference Include="xunit.runner.visualstudio" Version="2.3.1" />

Contributor Author

ok, thanks!

<PackageReference Include="Microsoft.NET.Test.Sdk" Version="15.7.0" />
<PackageReference Include="xunit" Version="2.3.1" />
<PackageReference Include="xunit.runner.visualstudio" Version="2.3.1" />
<DotNetCliToolReference Include="dotnet-xunit" Version="2.3.1" />
Member

This line can be removed altogether, since it is not needed.

Contributor Author

okey dokey

[<Fact>]
let ``FSharp-Sentiment-Smoke-Test`` () =

let testDataPath = __SOURCE_DIRECTORY__ + @"/../data/wikipedia-detox-250-line-data.tsv"
Member

Does __SOURCE_DIRECTORY__ work in non-interactive mode?

Contributor Author

yes it does
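
For context, `__SOURCE_DIRECTORY__` is an F# literal that the compiler replaces with the directory of the containing source file, so it is available in compiled assemblies as well as in F# Interactive. The sketch below illustrates this under stated assumptions: the `dataPath` helper is hypothetical and not part of this PR, and `Path.Combine` is used in place of string concatenation only as a portability preference.

```fsharp
open System.IO

// __SOURCE_DIRECTORY__ is substituted at compile time with the directory
// containing this source file, so it works in compiled code and in scripts.
let sourceDir : string = __SOURCE_DIRECTORY__

// Hypothetical helper (not from the PR): build an absolute path to a file
// in a sibling "data" directory, relative to the source directory.
let dataPath (fileName: string) =
    Path.GetFullPath(Path.Combine(sourceDir, "..", "data", fileName))

printfn "%s" (dataPath "wikipedia-detox-250-line-data.tsv")
```

Because the value is fixed at compile time, a test that relies on it presumably needs the source tree to be present at the same location when the test runs.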

<TargetFrameworks Condition="'$(OS)' != 'Unix'">$(TargetFrameworks); net461</TargetFrameworks>
<NoWarn>2003;$(NoWarn)</NoWarn>
<PublicSign>false</PublicSign>
<SourceLink></SourceLink>
Member

What's this for? Is this necessary?

Contributor Author

I got a build error when building from the IDE using VS 15.7 because, for .NET 4.6.1 compilation, "full" PDB symbols (not portable) were being produced. I can reproduce it if I build with msbuild rather than dotnet. Full PDBs are not compatible with SourceLink. The error doesn't go away even if you set DebugType to "portable".

I think it is an F# build targets bug that is fixed in 15.8. Anyway, for now it seemed sensible to turn off SourceLink for the test project.

Member

@eerhardt eerhardt left a comment

:shipit:

@dsyme
Contributor Author

dsyme commented Jul 30, 2018

OK, the test is failing now that it is running. Oddly, it's not failing on my machine. Will look into it.

2018-07-30T19:44:52.3992357Z Failed   Microsoft.ML.FSharp.Tests.SmokeTest1.FSharp-Sentiment-Smoke-Test
2018-07-30T19:44:52.3992813Z Error Message:
2018-07-30T19:44:52.3993183Z  System.InvalidOperationException : Entry point 'Transforms.TextFeaturizer' not found
2018-07-30T19:44:52.3993578Z Stack Trace:
2018-07-30T19:44:52.3994281Z    at Microsoft.ML.Runtime.EntryPoints.EntryPointNode..ctor(IHostEnvironment env, IChannel ch, ModuleCatalog moduleCatalog, RunContext context, String id, String entryPointName, JObject inputs, JObject outputs, Boolean checkpoint, String stageId, Single cost, String label, String group, String weight, String name)
2018-07-30T19:44:52.3994761Z    at Microsoft.ML.Runtime.EntryPoints.EntryPointNode.ValidateNodes(IHostEnvironment env, RunContext context, JArray nodes, ModuleCatalog moduleCatalog, String label, String group, String weight, String name)
2018-07-30T19:44:52.3995085Z    at Microsoft.ML.Runtime.EntryPoints.EntryPointGraph..ctor(IHostEnvironment env, ModuleCatalog moduleCatalog, JArray nodes)
2018-07-30T19:44:52.3995384Z    at Microsoft.ML.Runtime.Experiment.Compile()
2018-07-30T19:44:52.3995640Z    at Microsoft.ML.LearningPipeline.Train[TInput,TOutput]()
2018-07-30T19:44:52.3995938Z    at Microsoft.ML.FSharp.Tests.SmokeTest1.FSharp-Sentiment-Smoke-Test()

At least that's proof that it's running

@dsyme
Contributor Author

dsyme commented Jul 31, 2018

@eerhardt Is there any ongoing discussion about the design of ML.NET's Component Catalog mechanism, e.g. is there any prospect of scrapping it in favour of more traditional direct static library references? What's the driving motivation of the mechanism?

I find it strange that we would have this extra level of indirection, and it seems to be causing problems for F# - in scripts we need the artificial stuff mentioned in #401. The need for that kind of thing is always indicative of a problem (of over-complexification) in the underlying framework - "normal" .NET libraries don't need this sort of thing.

Also, the component catalog seems to be populated via the assemblies reachable via direct static references (I'm not totally sure about that, but that's kind of the point - the mechanism is just obscure). It's too easy to write code where there isn't a static reference to, say, Microsoft.ML.Transforms. For example, prior to adding 4e131b5 there was no static reference to Microsoft.ML.Transforms in the F# unit test DLL (the F# compiler only emits the static references that are actually used in the output DLL; I think the same is true of C#). I think (not yet sure) that was the cause of the unit test failure above.
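
As a rough illustration of the point about static references, one common way to make sure an assembly is referenced and loaded is to mention one of its types explicitly, for example with `typeof<_>`. The sketch below is an assumption-laden stand-in, not the change made in 4e131b5: it uses `System.Xml.Linq.XDocument` from the base class library in place of a type from Microsoft.ML.Transforms, and the names `forcedReferences` and `isLoaded` are hypothetical.

```fsharp
open System

// Stand-in type: in the PR's scenario this would be some type that lives in
// the assembly the component catalog needs to see. Mentioning it in code
// makes the compiler record a static reference to its assembly, and
// evaluating typeof<_> causes the runtime to load that assembly.
let forcedReferences : Type list = [ typeof<System.Xml.Linq.XDocument> ]

// Check whether an assembly is currently loaded in this AppDomain.
let isLoaded (asm: Reflection.Assembly) =
    AppDomain.CurrentDomain.GetAssemblies() |> Array.contains asm

for t in forcedReferences do
    printfn "%s loaded: %b" (t.Assembly.GetName().Name) (isLoaded t.Assembly)
```

Whether a loaded assembly is then picked up by the component catalog depends on how the catalog scans for components, which this sketch does not model.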

Basically, what is the component catalog really for, and is it really necessary? It's always going to sit very awkwardly alongside other .NET componentization models (including the traditional one of static references) - e.g. using it with DI frameworks might be really rough. Why isn't ML.NET just a normal set of .NET libraries using normal static assembly references?

@dsyme dsyme mentioned this pull request Jul 31, 2018
2 tasks
@Ivanidzo4ka
Contributor

#208 is probably the closest one, but it doesn't answer the question of why we have it in the first place.

I can give you one example where the existence of the component catalog makes sense (at least to me).

Simple scoring.
You have a model trained somewhere and all you want is to run it.

With the current code, you just do

model.Read("model.zip");
model.Predict(data);

and all you need to do is add the NuGet package and it will work (at least that was the case for .NET Framework 4.5, where the content of the NuGet package was put into your binary folder; for .NET Standard we need to make changes in our component catalog).

This way you don't have to add references to the libraries in your project.


@eerhardt
Member

eerhardt commented Jul 31, 2018

Yes, there has been discussion here: #371. Basically, the plan of record is that we will move to directly calling the underlying C# APIs without going through the extra layer of indirection. That will solve the majority of the issues mentioned in #401. (Note, we have the same issue in Azure Functions - #559)

However, that doesn't completely solve the issue of dependency injection. The other place where it must be used is when we load a .zip file (i.e. a "model" file) from disk. Someone could create a model file that used EricsCoolLearner from EricsCool.dll, and you can write an app that tries to load that model. However, you don't necessarily have a direct/static library reference to EricsCool.dll. (Or even if you did have the reference, you might not have used a type in that assembly yet, so it may not be loaded.)
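
The loading problem described above can be sketched with plain .NET reflection. This is not how ML.NET's component catalog is implemented; it only shows that a type named in a file cannot be resolved when its assembly is not available to the app. The `EricsCool` namespace and the `tryResolve` helper are hypothetical names introduced for illustration.

```fsharp
open System

// Hypothetical assembly-qualified type name, as a model file might record it.
let learnerTypeName = "EricsCool.EricsCoolLearner, EricsCool"

// Try to resolve a component type by name at runtime. With throwOnError set
// to false, Type.GetType returns null when the assembly or type is not found.
let tryResolve (assemblyQualifiedName: string) : Type option =
    Type.GetType(assemblyQualifiedName, false) |> Option.ofObj

match tryResolve learnerTypeName with
| Some t -> printfn "Resolved %s" t.FullName
| None -> printfn "Could not resolve '%s'; its assembly is not available to the app" learnerTypeName
```

A runtime discovery mechanism has to decide where to look for such assemblies, which is the role the comment attributes to the component catalog.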

So, in this scenario, we are still going to need dependency injection at runtime. That's the main usage scenario for component catalog. (It is also used by other tooling, like the MAML command line tool, and the Entry Points subsystem.)

In investigating the Azure Functions issue, we discovered other issues with the way that the component catalog loads assemblies. At its heart, the issue is that we don't work well with custom AssemblyLoadContext instances. That issue is being tracked at #569.

We will also have to change the code that only looks in the current folder for components. Instead, we should probably use the DependencyContext when running on .NET Core (similar to how ASP.NET MVC loads all the controller classes).

@dsyme
Contributor Author

dsyme commented Aug 2, 2018

Was integrated in #616

@dsyme dsyme closed this Aug 2, 2018
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022