cid implementation variations++ #72

warpfork · 2018-08-27T23:10:16Z

Enumerating another option -- "Option F" -- for type struct{string}.

I originally got nerdsniped into looking at this because of a comment elsewhere; it's also what the tip of #71 is doing. Definitely deserves to be uniquely identified as an option.

I didn't bother to write a whole new implementation in this package of this thing, because it would be almost identical to the string typedef impl... but the exploratory inspection code I came up with in the playground is easy to keep in the lab notebook, so here it is.

It's also viable. Options list expanded. (And regretting my ordering of it now. Wish I'd thought of this one and realized it's distinct earlier.)

Stebalien · 2018-08-27T23:14:28Z

_rsrch/cidiface/README.md

+
+Option F (a struct wrapper) means we can prevent casting into our Cid type.
+Option B (typedef string) can be declared a `const`.
+Are there any other concerns that would separate the two choices?


The struct wrapper allows us to add additional metadata/offsets later if we want.

That's "Option D" or "Option E" though, respectively.

Tbh, I think basically all of these are equal in terms of how easy they are to switch to any of the others, later. Switching from typedef-string to struct later would be a "breaking change" in terms of how a linker things about the raw assembly... but it will not be a breaking change syntactically‡, and that's what matters to us as humans estimating refactor costs.

[‡] Presuming we make all accesses outside the cid package itself via public functions, and define a cid.Zero. Which would be the plan, I think. The only thing that makes this first big refactor a pain is the syntactic change of removing the * and fixing all the == nil occurances.

People would still compare against "" and cast to a string/bytes.

The former should be rejected in code review based on the existence of cid.Zero and introduced universally when we do the refactor; the latter would be a bug.

If we're seriously worried about these, then we should 100% hands-down prefer Option F over B, because more than precisely zero instances of casting to Cid from outside the cid package would strike me as acceptable code quality.

The former should be rejected in code review based on the existence of cid.Zero and introduced universally when we do the refactor; the latter would be a bug.

That's not an assumption we can make, generally. Users (not in our projects) will cast. It's really the primary reason to use a type alias over a struct.

For the choice of the constant I am going with Nil. My second choice would be Empty, and Zero the third. Nil to me is neutral and also implies an invalid value, of which an empty Cid{} struct is (it can't be used for anything). Empty to me implies a valid but empty container, but I am okay with it if Nil is causing that much of a problem. Zero to me implies a numeric value, of which a Cid is not. I understand the notion of a zero-value in compiler terminology, but Zero still strikes me as an odd choice, ZeroValue may be a bit better, but if Nil is no good I still rather go with Empty.

I didn't realize that using struct initializers in the middle of equality was so verbose. However, it was never my intent to use them in equality. Instead I provided an IsNil method as c.IsNil() is nicer than c == cid.Nil or c == (cid.Cid{}). However elected to still provide var Nil = Cid{} so it can be used if desired.

If we're going to make speculations about whether comparison to a var is going to be a performance issue, then shouldn't proposing an equality method have at least as much burden to prove itself a non-issue before we consider that a confirmed design choice?

I disagree completely about naming a variable of non-pointer type "Nil". That is extremely unidiomatic Go because it conflicts with the language's own definitions of the term, and in discussion elsewhere I've already linked the part of the language spec which defines zero versus nil. But this PR is probably not the topical place to relitigate this naming, either, seeing as how the term is not used anywhere in the diff body we're discussing.

I'm strongly in favor of Nil as a constant. While Zero is technically correct, it's confusing.

However, as a method, I'd be fine with IsEmpty (or IsZero but I'd rather not).

As for the method, golang is pretty good about inlining one-line methods so it should be fine. After doing a bit of decompiling, it looks like we should go with:

func (c Cid) IsNil() bool { return c.string == "" }

c == (Cid{}) seems to cost us an extra xor zeroing something out. c == Nil costs us a mov.

See: https://go.godbolt.org/z/CHyhZl

@Stebalien the IsNilString one looks wrong. The compiler is getting really smart there and looks like its comparing the empty string to itself, and calling that the output.

(discussed on slack)

The assembly was correct. The generated instruction was testq AX, AX which tests AX & AX to see if it's non-nil.

warpfork · 2018-08-27T23:15:51Z

_rsrch/cidiface/README.md

+
+Option F (a struct wrapper) means we can prevent casting into our Cid type.
+Option B (typedef string) can be declared a `const`.
+Are there any other concerns that would separate the two choices?


I was going to speculate about if one would be easier/faster for refmt to munge than the other, because struct-containing-string seems like it going to require at least a few more reflection calls...

But on second thought that probably doesn't matter at all; CIDs are going to be using transform funcs either way because the serial type is either bytes (which is a different reflect.Kind than string, so needs transform) or base-$n$ encoded string (which... needs to be de-base-$n$'d, so that's a transform). So it's the same.

Are there any other things that would make B/F distinct? 🤔

warpfork · 2018-08-29T13:00:43Z

I think this has generated some good discussion, and we could merge it now.

In review: The diff body improves the README by enumerating a new option; it mentions the casting and const issues; and it provides some evidence of memory layout impact. Thus, it's strictly better documentation and more information than before.

Since we don't seem to be coming up with any more distinctions between B/F, and just thinking out loud about how we balance them, I think we can move on. As with the prior PR, this is all contained in a directory which is research content only and affects no APIs, so there's no splash radius to consider before merging.

Inspect memory layout of struct wrapping string.

924534b

It's also viable. Options list expanded. (And regretting my ordering of it now. Wish I'd thought of this one and realized it's distinct earlier.)

ghost assigned warpfork Aug 27, 2018

ghost added the status/in-progress In progress label Aug 27, 2018

Stebalien reviewed Aug 27, 2018

View reviewed changes

warpfork commented Aug 27, 2018

View reviewed changes

Stebalien approved these changes Aug 30, 2018

View reviewed changes

Stebalien merged commit 1766ab0 into master Aug 30, 2018

ghost removed the status/in-progress In progress label Aug 30, 2018

warpfork deleted the rsrch-cid-as-struct-wrapped-str branch August 30, 2018 09:10

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cid implementation variations++ #72

cid implementation variations++ #72

warpfork commented Aug 27, 2018

Stebalien Aug 27, 2018

warpfork Aug 27, 2018 •

edited

Loading

Stebalien Aug 28, 2018

warpfork Aug 28, 2018

Stebalien Aug 28, 2018

kevina Aug 29, 2018

warpfork Aug 29, 2018

Stebalien Aug 29, 2018

whyrusleeping Aug 29, 2018

Stebalien Aug 29, 2018

warpfork Aug 27, 2018

warpfork commented Aug 29, 2018

cid implementation variations++ #72

cid implementation variations++ #72

Conversation

warpfork commented Aug 27, 2018

Choose a reason for hiding this comment

warpfork Aug 27, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

warpfork commented Aug 29, 2018

warpfork Aug 27, 2018 •

edited

Loading