Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API to return ColumnPairs for a OneToOneTransformer. #3088

Merged
merged 3 commits into from
Mar 26, 2019

Conversation

codemzs
Copy link
Member

@codemzs codemzs commented Mar 25, 2019

fixes #3087

@codemzs codemzs changed the title API to return ColumnPairs for a transformer. API to return ColumnPairs for a OneToOneTransformer. Mar 25, 2019
@codecov
Copy link

codecov bot commented Mar 26, 2019

Codecov Report

Merging #3088 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3088      +/-   ##
==========================================
+ Coverage   72.51%   72.52%   +<.01%     
==========================================
  Files         807      808       +1     
  Lines      144642   144659      +17     
  Branches    16197    16198       +1     
==========================================
+ Hits       104890   104908      +18     
  Misses      35341    35341              
+ Partials     4411     4410       -1
Flag Coverage Δ
#Debug 72.52% <100%> (ø) ⬆️
#production 68.12% <100%> (ø) ⬆️
#test 88.81% <100%> (-0.01%) ⬇️
Impacted Files Coverage Δ
...soft.ML.Data/Transforms/OneToOneTransformerBase.cs 95.08% <ø> (ø) ⬆️
....Experimental/OneToOneTransformerBaseExtensions.cs 100% <100%> (ø)
.../Microsoft.ML.Data/Transforms/ExtensionsCatalog.cs 95.23% <100%> (-0.22%) ⬇️
...Microsoft.ML.Tests/Transformers/NormalizerTests.cs 100% <100%> (ø) ⬆️
...soft.ML.TestFramework/DataPipe/TestDataPipeBase.cs 73.7% <0%> (-0.34%) ⬇️
...ML.Transforms/Text/StopWordsRemovingTransformer.cs 86.1% <0%> (-0.16%) ⬇️
...soft.ML.Data/DataLoadSave/Text/TextLoaderCursor.cs 84.7% <0%> (ø) ⬆️
...StandardTrainers/Standard/LinearModelParameters.cs 60.31% <0%> (+0.26%) ⬆️
src/Microsoft.ML.Transforms/Text/LdaTransform.cs 89.89% <0%> (+0.62%) ⬆️

@codemzs codemzs requested a review from wschin March 26, 2019 00:27
Copy link
Contributor

@artidoro artidoro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make sure to also double check with @TomFinley about this please.

Copy link
Member

@wschin wschin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@codemzs
Copy link
Member Author

codemzs commented Mar 26, 2019

@TomFinley Please leave any comments you may have and I will address them in another PR. I have sign-offs from @artidoro, @wschin and @Ivanidzo4ka for now.

CC: @artidoro

@codemzs codemzs merged commit a18be69 into dotnet:master Mar 26, 2019
@TomFinley
Copy link
Contributor

TomFinley commented Mar 26, 2019

@TomFinley Please leave any comments you may have and I will address them in another PR. I have sign-offs from @artidoro, @wschin and @Ivanidzo4ka for now.

Hi @codemzs and @artidoro, I am replying here with my opinions. Short reaction is, I think this is a good change, and entirely in the spirit of what I intended to be done with the experimental nuget, pursuant to #2279, and I am happy where it ended up. I see no reason for a new PR at this time.

One complaint: It would have been a bit cleaner to just make an internal method on OneToOneTransformerBase directly that returns these pairs, and just call that method simply. The way in which it was actually written (to make the field public, then operate on that field across the assembly) represents a needless breaking of the principle of encapsulation. I mean, there was no point to it, and now the class is significantly harder to understand. (Every time you make a central structure in a class more broadly accessible, you make that class harder to comprehend.)

Aside from breaking encapsulation, looking at it from a more general perspective, if we consider that the experimental API represents candidates for things that might become public, it is fair cleaner if the internal method in the stable code mirrors precisely the public extension method on the experimental code. That way, if we decide, "you know what, this should be public," it's just a matter of deleting the extension method then changing the internal to public. So that represents a simpler transition.

But, these are things that can be fixed later, as they deal with internal structures. No urgency.

Beyond that, in side communications I sensed concern that this might not be enough information, that the return type of this input/output pair is not right. My answer is: yes, you're probably right, but I've heard so many requests for information out of ITransformers (ranging the gamut from the reasonable, as this is, to the ridiculous), that I start to suspect an API that could make everyone happy is impossible. I suspect that where we wind up will be something a bit more sophisticated than this, but what the shape of that will look like, I have no idea. So, given that I have no idea, I'm happy to start with something simple and easy to do (as @codemzs has done here) made to address a very specific, limited, reasonable request for information meant to address a specific application, so we and they at least have a starting point. And this sort of working in a 'new area' is precisely why we have this experimental project, and other experimental projects! Yes the types will change, no doubt, but that's why it's in experimental I think.

So, I think it's good.

@ghost ghost locked as resolved and limited conversation to collaborators Mar 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API to return ColumnPairs for a transformer.
5 participants