Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace DvBool with .NET standard type. #692

Closed
wants to merge 20 commits into from
Closed

Conversation

codemzs
Copy link
Member

@codemzs codemzs commented Aug 18, 2018

fixes #673

@codemzs codemzs changed the title Replace DvBool with .NET standard for Boolean type. Replace DvBool with .NET standard type. Aug 19, 2018
{
Ch.Assert(colType.ItemType.IsBool);
return CreateConvertingArrayGetterDelegate<bool?, DvBool>(index, x => x ?? DvBool.NA);
return CreateConvertingArrayGetterDelegate<bool, bool>(index, x => x);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool [](start = 77, length = 4)

you don't have to do it.
just fall back to default T[] -> VBuffer code. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course.


In reply to: 211109130 [](ancestors = 211109130)

Ch.Assert(colType.IsBool);
return CreateConvertingGetterDelegate<bool?, DvBool>(index, x => x ?? DvBool.NA);
return CreateConvertingGetterDelegate<bool, bool>(index, x => x);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return CreateConvertingGetterDelegate<bool, bool>(index, x => x); [](start = 27, length = 66)

not needed, fall back to T-> T default #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep.


In reply to: 211109149 [](ancestors = 211109149)

{
Ch.Assert(colType.ItemType.IsBool);
return CreateConvertingVBufferSetter<DvBool, bool?>(input, index, poke, peek, x => (bool?)x);
return CreateConvertingVBufferSetter<bool, bool>(input, index, poke, peek, x => x);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fall back to // VBuffer -> T[] #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is correct.


In reply to: 211109159 [](ancestors = 211109159)

Ch.Assert(colType.IsBool);
Ch.Assert(peek == null);
return CreateConvertingActionSetter<DvBool, bool?>(input, index, poke, x => (bool?)x);
return CreateConvertingActionSetter<bool, bool>(input, index, poke, x => x);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CreateConvertingActionSetter [](start = 31, length = 28)

fall back to // T -> T #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

falling back ...


In reply to: 211109169 [](ancestors = 211109169)

@@ -11,10 +11,11 @@
using System.Text;
using System.Threading;
using Microsoft.ML.Runtime.Internal.Utilities;
using System.Runtime.CompilerServices;
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorting #Resolved

dst.Append("0");
else if (src.IsTrue)
else if (src)
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (src) [](start = 17, length = 8)

this if looks redundant #Resolved

@@ -1826,8 +1821,8 @@ public void Convert(ref TX span, ref DZ value)
public void Convert(ref BL src, ref I2 dst) => dst = (I2)src;
public void Convert(ref BL src, ref I4 dst) => dst = (I4)src;
public void Convert(ref BL src, ref I8 dst) => dst = (I8)src;
public void Convert(ref BL src, ref R4 dst) => dst = (R4)src;
public void Convert(ref BL src, ref R8 dst) => dst = (R8)src;
public void Convert(ref BL src, ref R4 dst) => dst = System.Convert.ToSingle(src);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

System [](start = 61, length = 6)

you have using System; in beginning of file, is this System necessary? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, it doesn't seem to recognize otherwise.


In reply to: 211109280 [](ancestors = 211109280)

@@ -596,7 +596,7 @@ public override void AttachMetadata(MetadataDispatcher.Builder bldr, ColumnType
Host.Check(typeSrc.RawType == typeof(TFloat));
bldr.AddPrimitive("CdfMean", typeSrc, Mean);
bldr.AddPrimitive("CdfStdDev", typeSrc, Stddev);
bldr.AddPrimitive("CdfUseLog", BoolType.Instance, (DvBool)UseLog);
bldr.AddPrimitive("CdfUseLog", BoolType.Instance, (bool)UseLog);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(bool) [](start = 70, length = 6)

not necessary, UseLog already bool #Resolved

@@ -625,7 +625,7 @@ public override void AttachMetadata(MetadataDispatcher.Builder bldr, ColumnType
Host.Check(typeSrc.ItemType.RawType == typeof(TFloat));
bldr.AddGetter<VBuffer<TFloat>>("CdfMean", typeSrc, MeanMetadataGetter);
bldr.AddGetter<VBuffer<TFloat>>("CdfStdDev", typeSrc, StddevMetadataGetter);
bldr.AddPrimitive("CdfUseLog", BoolType.Instance, (DvBool)UseLog);
bldr.AddPrimitive("CdfUseLog", BoolType.Instance, (bool)UseLog);
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool [](start = 71, length = 4)

not necessary, UseLog already bool #Resolved

@@ -368,8 +368,8 @@ private void FillValues(int srcLength, ref VBuffer<DvBool> dst, List<int> indice

int ii = 0;
// Assigns values correctly depending on the sense.
DvBool hit = sense ? DvBool.True : DvBool.False;
DvBool miss = sense ? DvBool.False : DvBool.True;
bool hit = sense;
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool hit = sense; [](start = 16, length = 17)

is it necessary? can you just use sense instead of hit? (I'm totally fine with miss) #Resolved

@@ -10,60 +10,6 @@ namespace Microsoft.ML.Runtime.RunTests
{
public sealed class DvTypeTests
{
[Fact]
public void TestComparableDvInt4()
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestComparableDvInt4 [](start = 20, length = 20)

why you removing whole test? Is DvInt4 gone already or in this PR? #Resolved

Copy link
Member Author

@codemzs codemzs Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another PR that removes DvInt* #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its gone in DvInt4 PR.


In reply to: 211109511 [](ancestors = 211109511)

@@ -309,7 +309,6 @@ public class ConversionNullalbeClass
public ulong? fuLong;
public float? fFloat;
public double? fDouble;
public bool? fBool;
Copy link
Contributor

@Ivanidzo4ka Ivanidzo4ka Aug 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

public bool? fBool; [](start = 12, length = 19)

This kinda make me sad.
I understand what we want to have pure framework, and as workaround we force people to cast everything to float which support NaNs, it just in my life I worked with Something2SQL frameworks, and quite often you have nullable boolean columns, and if they fetch through DTO object, it will make that object uncastable /unwrappable for our collection to dataview.
Which will lead to bunch of painful casting/objects recreations.
But again, what do I know about life. #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, read Tom's comment on the issue page you might become happy! :)


In reply to: 211109825 [](ancestors = 211109825)

@Ivanidzo4ka
Copy link
Contributor

Ivanidzo4ka commented Aug 20, 2018

Can you link issue which trigger this PR? #Resolved

@danmoseley
Copy link
Member

danmoseley commented Aug 20, 2018

Are you removing DvBool.cs in a followup PR? #Resolved

}

[Fact]
public void NullableBooleanLabelPipeline()
Copy link
Member

@eerhardt eerhardt Aug 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we decided we wanted to support bool?? #Resolved

Copy link
Member Author

@codemzs codemzs Aug 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a compelling reason?

CC: @TomFinley #Resolved

Copy link
Member

@eerhardt eerhardt Aug 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, Tom, didn't think there is a compelling reason to support bool? But if you can think of something then let me know. There were two places where bool? was being used in the codebase and they were both in prediction label from score, i.e if score was null then assign predicted label as NA but we defaulted to false for missing and you can see the change in test results. Tom is ok with this.


In reply to: 211434928 [](ancestors = 211434928)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this is from nullable stuff that comes in from sql tables, we don;t need to support this, please see Tom;s comment on the issue comment section.


In reply to: 211678395 [](ancestors = 211678395)

@codemzs
Copy link
Member Author

codemzs commented Aug 26, 2018

Yes.


In reply to: 414418364 [](ancestors = 414418364)

@@ -63,8 +63,7 @@ public CodecFactory(IHostEnvironment env, MemoryStreamPool memPool = null)

// Register the old boolean reading codec.
var oldBool = new OldBoolCodec(this);
RegisterOtherCodec(oldBool.LoadName, oldBool.GetCodec);

RegisterOtherCodec("DvBool", oldBool.GetCodec);
Copy link
Member

@eerhardt eerhardt Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we removed DvBool. ? #Resolved

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its for backward compatibility, incase someone saves a dataset in IDV format with the old DvTypes and loads in ML.NET without DvTypes.


In reply to: 213403080 [](ancestors = 213403080)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We aren't doing that for the other Dv types.... Why are we doing it for DvBool?


In reply to: 213408068 [](ancestors = 213408068,213403080)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm doing for DvDate/Time as well. For DvInt we don't need to change because the key for codec has not changed. The key is used to retrieve the codec to parse when the loader reads the signature string.

DvBool is the only codec where we have also modified the codec as well to use one bit to write a boolean value as opposed to two bits. We need the old codec to read values from IDV that was written using DvBool.


In reply to: 213420305 [](ancestors = 213420305,213408068,213403080)

Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@shauheen shauheen added the enhancement New feature or request label Aug 28, 2018
@shauheen shauheen added this to the 0818 milestone Aug 28, 2018
@shauheen shauheen removed this from the 0818 milestone Aug 31, 2018
…to dvbool

# Conflicts:
#	test/BaselineOutput/SingleDebug/SavePipe/TestParquetPrimitiveDataTypes-Data.txt
#	test/BaselineOutput/SingleDebug/SavePipe/TestParquetPrimitiveDataTypes-Schema.txt
#	test/BaselineOutput/SingleRelease/SavePipe/TestParquetPrimitiveDataTypes-Data.txt
#	test/BaselineOutput/SingleRelease/SavePipe/TestParquetPrimitiveDataTypes-Schema.txt
#	test/data/Parquet/alltypes.parquet
…to dvbool

# Conflicts:
#	src/Microsoft.ML.Data/Transforms/NormalizeTransform.cs
@codemzs
Copy link
Member Author

codemzs commented Sep 20, 2018

This change was committed via #863

@codemzs codemzs closed this Sep 20, 2018
@codemzs codemzs deleted the dvbool branch September 20, 2018 18:17
@ghost ghost locked as resolved and limited conversation to collaborators Mar 29, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.NET data type system instead of DvTypes
5 participants