This repository has been archived by the owner on Jun 6, 2023. It is now read-only.

Change Label to be a Union of strings and bytes #1580

Merged (17 commits) on Apr 6, 2022


arajasek
Collaborator

@arajasek arajasek commented Mar 12, 2022

Implements FIP-0027. Will need a migration.

@arajasek arajasek changed the title from "Market state: Change Label to be a Union of strings and bytes" to "DO NOT MERGE: Market state: Change Label to be a Union of strings and bytes" Mar 12, 2022
Member

@anorth anorth left a comment

Thanks, very helpful.


var EmptyDealLabel = DealLabel{}

func NewDealLabelFromString(s string) (DealLabel, error) {
Member

Nit: deal.NewLabelFromString, don't repeat the word "deal".

@ZenGround0 ZenGround0 changed the base branch from release/v7 to master April 4, 2022 22:55
@codecov-commenter

codecov-commenter commented Apr 5, 2022

Codecov Report

Merging #1580 (f65d10f) into master (5f5089f) will increase coverage by 0.0%.
The diff coverage is 69.2%.

@@          Coverage Diff           @@
##           master   #1580   +/-   ##
======================================
  Coverage    69.0%   69.0%           
======================================
  Files          73      74    +1     
  Lines        8798    8887   +89     
======================================
+ Hits         6077    6140   +63     
- Misses       1859    1873   +14     
- Partials      862     874   +12     

@ZenGround0 ZenGround0 changed the title from "DO NOT MERGE: Market state: Change Label to be a Union of strings and bytes" to "Change Label to be a Union of strings and bytes" Apr 5, 2022
@ZenGround0 ZenGround0 marked this pull request as ready for review April 5, 2022 18:01
@ZenGround0 ZenGround0 requested a review from a team as a code owner April 5, 2022 18:01
@ZenGround0
Contributor

Can't tag you for review since you opened this, @arajasek, but please review this too.

Member

@rvagg rvagg left a comment

👌 very nice, I'm excited about this

Collaborator Author

@arajasek arajasek left a comment

(Disclaimer: I'm not sure how much of this came from @ZenGround0, I may be disagreeing with past-Aayush)

I have some concerns about the variance in what can and cannot be constructed. For example:

  • You can unmarshal labels whose lengths are longer than DealMaxLabelSize. I'm not sure what the impact of this might be, but it feels undesirable -- we could mitigate by having the Unmarshal method use the same NewLabelFromString and NewLabelFromBytes constructors?
  • Am I misunderstanding, or would NewLabelFromString("").IsEmpty() return false? Is that intentional? We could mitigate by making IsEmpty() return len(label.s) == 0 && len(label.bs) == 0?
  • Is the ToString / ToBytes method purely for testing? It feels a little weird to have them -- in my head the point of this struct is to say "your label can be 1 of 2 (potentially very different) types, and everything will work seamlessly". Converting between them is odd to me.
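The second bullet can be reproduced with a minimal sketch of the pointer-based representation under review at this point in the thread. The constructor signatures, the size check, and the value of `DealMaxLabelSize` are assumptions made for illustration; only the struct fields and the body of `IsEmpty()` are taken from the snippets quoted in this conversation.

```go
package main

import "fmt"

// DealMaxLabelSize is an assumed application-level limit for this sketch.
const DealMaxLabelSize = 256

// DealLabel mirrors the pointer-based draft: at most one field is set.
type DealLabel struct {
	s  *string
	bs *[]byte
}

// EmptyDealLabel holds neither a string nor bytes.
var EmptyDealLabel = DealLabel{}

// NewLabelFromString wraps a string after a hypothetical size check.
func NewLabelFromString(s string) (DealLabel, error) {
	if len(s) > DealMaxLabelSize {
		return EmptyDealLabel, fmt.Errorf("label too long: %d > %d", len(s), DealMaxLabelSize)
	}
	return DealLabel{s: &s}, nil
}

// NewLabelFromBytes wraps a byte slice after the same hypothetical size check.
func NewLabelFromBytes(b []byte) (DealLabel, error) {
	if len(b) > DealMaxLabelSize {
		return EmptyDealLabel, fmt.Errorf("label too long: %d > %d", len(b), DealMaxLabelSize)
	}
	return DealLabel{bs: &b}, nil
}

// IsEmpty reports whether neither field is set. It inspects the pointers,
// not the lengths, so a label built from "" is not considered empty.
func (label DealLabel) IsEmpty() bool {
	return label.s == nil && label.bs == nil
}
```

Under these assumptions, a label built from `""` holds a non-nil pointer to an empty string, so `IsEmpty()` returns false for it while `EmptyDealLabel.IsEmpty()` returns true.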

actors/builtin/market/deal.go (outdated, resolved)
@ZenGround0
Contributor

Yes, bullets 1 and 2 were inherited from your draft of the PR.

For the first point, this is just the current state of the chain: after unmarshalling proposals, we further check that we're not exceeding the max label length. That seems like the right breakdown. Let CBOR worry about CBOR limits and let application code worry about application limits.

> Am I misunderstanding, or would NewLabelFromString("").IsEmpty() return false? Is that intentional? We could mitigate by making IsEmpty() return len(label.s) == 0 && len(label.bs) == 0?

Yeah, it's intentional. It's a type thing, not a value thing. Empty means it contains no string or bytes, and the empty string is a string. EmptyLabel serializes to a string, but it's not a string. This is all just shorthand to be a bit clearer than comparing both fields to nil in a few places. I'll add a comment, but I don't think this needs to change. Your suggestion is outdated given the pointer representation I changed to on steb's suggestion: bs and s are pointers, so they never have a length.

For the final point, you added these when you made this PR, and they turned out to be useful for testing. I made ToBytes convert a string on steb's suggestion. But I think you're right that converting here doesn't make sense, so I'll change that back to erroring if it's not bytes.

Member

@Stebalien Stebalien left a comment

I would have done a few things a bit differently, but this looks correct.

Comment on lines 30 to 33
type DealLabel struct {
	s  *string
	bs *[]byte
}
Member

personal opinion: I'd write this as

type DealLabel struct {
	bs []byte
	notString bool
}

Your version is more "go-like", but also requires extra allocations and may be a bit annoying to deal with...
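A rough sketch of how the flag-based struct suggested above could be wrapped, assuming constructors and accessors along the lines discussed in this thread. The exact signatures and error messages are hypothetical, and size checks are omitted; the point is that the zero value needs no allocation and reads as an empty string because `notString` defaults to false.

```go
package main

import (
	"errors"
	"unicode/utf8"
)

// DealLabel stores the payload once and records only whether it is raw bytes.
type DealLabel struct {
	bs        []byte
	notString bool
}

// EmptyDealLabel is the zero value: no payload, treated as a string.
var EmptyDealLabel = DealLabel{}

// NewLabelFromString rejects invalid UTF-8 (size checks omitted in this sketch).
func NewLabelFromString(s string) (DealLabel, error) {
	if !utf8.ValidString(s) {
		return EmptyDealLabel, errors.New("label string is not valid UTF-8")
	}
	return DealLabel{bs: []byte(s)}, nil
}

// NewLabelFromBytes marks the payload as raw bytes.
func NewLabelFromBytes(b []byte) DealLabel {
	return DealLabel{bs: b, notString: true}
}

func (l DealLabel) IsString() bool { return !l.notString }
func (l DealLabel) IsBytes() bool  { return l.notString }

// ToString errors instead of converting when the label holds bytes.
func (l DealLabel) ToString() (string, error) {
	if l.notString {
		return "", errors.New("label is not a string")
	}
	return string(l.bs), nil
}

// ToBytes errors instead of converting when the label holds a string.
func (l DealLabel) ToBytes() ([]byte, error) {
	if !l.notString {
		return nil, errors.New("label is not bytes")
	}
	return l.bs, nil
}
```

Naming the flag `notString` rather than `isString` is what lets the zero value behave as the empty string without any constructor call.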

Contributor

Ironically, I was following up on your comment here; is this not the "put them behind pointers" approach?

Contributor

I'm going to move it to the above approach

Contributor

@ZenGround0 ZenGround0 Apr 6, 2022

But I think we want isString to make the zero value serialize to CBOR text, not CBOR bytes. Never mind: notString is the way it is to make this work nicely.

Member

> Ironically I was following up on your comment #1580 (comment) is this not the "put them behind pointers" approach?

Your version is the "put them behind pointers" approach. My version here is the "enum thing" approach (which, after looking at this, seems simpler).

> nvm notString is the way it is to make this nice

Welcome to Go! Zero values are over there 👉.

Comment on lines 62 to 64
func (label DealLabel) IsEmpty() bool {
	return label.s == nil && label.bs == nil
}
Member

Is this accurate? I'd assume that IsEmpty would mean "is empty". I.e., "" would also be empty.

@Stebalien
Member

> Yeah, it's intentional. It's a type thing, not a value thing. Empty means it contains no string or bytes, and the empty string is a string. EmptyLabel serializes to a string, but it's not a string. This is all just shorthand to be a bit clearer than comparing both fields to nil in a few places. I'll add a comment, but I don't think this needs to change. Your suggestion is outdated given the pointer representation I changed to on steb's suggestion: bs and s are pointers, so they never have a length.

We shouldn't distinguish between the two kinds of empty. That is, DealLabel{} and DealLabel{s: ""} should be indistinguishable.
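One way to get that property, sketched here with the flag-based struct proposed earlier in the review, is to compare payloads with `bytes.Equal`, which treats a nil slice and an empty slice as equal. The `Equals` body matches the line quoted later in this thread; the simplified constructors are assumptions for illustration.

```go
package main

import "bytes"

// DealLabel is the flag-based representation: one payload plus a type flag.
type DealLabel struct {
	bs        []byte
	notString bool
}

// NewLabelFromString is simplified here: no size or UTF-8 checks.
func NewLabelFromString(s string) DealLabel {
	return DealLabel{bs: []byte(s)}
}

// NewLabelFromBytes marks the payload as raw bytes.
func NewLabelFromBytes(b []byte) DealLabel {
	return DealLabel{bs: b, notString: true}
}

// Equals compares payload and type. bytes.Equal reports nil and empty
// slices as equal, so DealLabel{} equals a label built from "".
func (l DealLabel) Equals(o DealLabel) bool {
	return bytes.Equal(l.bs, o.bs) && l.notString == o.notString
}
```

With this shape, `DealLabel{}` and `NewLabelFromString("")` compare equal, while an empty bytes label still differs from an empty string label because the flag differs.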

@arajasek
Collaborator Author

arajasek commented Apr 6, 2022

Sorry, I'm still confused about the emptiness thing. Some questions:

  • Does the canonical EmptyDealLabel change if it's serialized and deserialized? I think it might go from nil, nil to &"", nil, which seems odd if it's canonical?
  • Can the IsEmpty() property change if a DealLabel is serialized and deserialized? I think it should be a requirement that none of the query-able properties change through the serialization-deserialization process, but I think that's not always the case.

I might be wrong about some of these claims.
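The concern can be illustrated with a deliberately simplified model of the pointer-based draft. It assumes, as stated earlier in the thread, that the empty label serializes as an empty CBOR text string, and that decoding a text string always populates the string pointer. The helpers `marshalShort` and `unmarshalShort` are hypothetical and only handle payloads shorter than 24 bytes, which fit in a one-byte CBOR header.

```go
package main

import "errors"

// DealLabel mirrors the pointer-based draft.
type DealLabel struct {
	s  *string
	bs *[]byte
}

func (label DealLabel) IsEmpty() bool {
	return label.s == nil && label.bs == nil
}

// marshalShort encodes a label as a CBOR text string (major type 3) or
// byte string (major type 2). An empty label becomes an empty text string.
func marshalShort(label DealLabel) ([]byte, error) {
	maj, payload := byte(3), []byte(nil)
	switch {
	case label.s != nil:
		payload = []byte(*label.s)
	case label.bs != nil:
		maj, payload = 2, *label.bs
	}
	if len(payload) >= 24 {
		return nil, errors.New("sketch only handles labels shorter than 24 bytes")
	}
	return append([]byte{maj<<5 | byte(len(payload))}, payload...), nil
}

// unmarshalShort decodes what marshalShort produces.
func unmarshalShort(data []byte) (DealLabel, error) {
	if len(data) == 0 {
		return DealLabel{}, errors.New("empty input")
	}
	maj, n := data[0]>>5, int(data[0]&0x1f)
	if n >= 24 || len(data) != 1+n {
		return DealLabel{}, errors.New("unsupported or truncated encoding")
	}
	switch maj {
	case 3:
		s := string(data[1:])
		return DealLabel{s: &s}, nil // pointer is set even for ""
	case 2:
		b := append([]byte(nil), data[1:]...)
		return DealLabel{bs: &b}, nil
	}
	return DealLabel{}, errors.New("label must be a text or byte string")
}
```

In this model, `DealLabel{}` reports `IsEmpty() == true` before encoding, but the decoded value holds a pointer to `""` and reports false, which is the property change the questions above are probing.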

return err
}

if _, err := w.Write((label.bs)[:]); err != nil {
Member

Suggested change
- if _, err := w.Write((label.bs)[:]); err != nil {
+ if _, err := w.Write(label.bs); err != nil {

return err
}
} else { // label.IsString()
if len(label.bs) > cbg.MaxLength {
Member

We use ByteArrayMaxLen in both cases (on further inspection). I'd just lift this case to the top.

return bytes.Equal(l.bs, o.bs) && l.notString == o.notString
}

func (label *DealLabel) MarshalCBOR(w io.Writer) error {
Member

I think we can vastly simplify this now:

  1. If null, write empty string.
  2. If too large, return an error.
  3. Set a majorType variable depending on the type.
  4. Write the value (no need for a conditional).
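The four steps above can be sketched as follows. This is a standalone illustration, not the merged code: it hand-rolls the CBOR header with the standard library instead of using cbor-gen, and `maxLabelLen`, `writeHeader`, and `encodeLabel` are names assumed for the sketch.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

const (
	majByteString = 2       // CBOR major type for byte strings
	majTextString = 3       // CBOR major type for text strings
	maxLabelLen   = 2 << 20 // assumed stand-in for the shared length limit
)

type DealLabel struct {
	bs        []byte
	notString bool
}

// writeHeader writes a CBOR major type and length using the shortest form.
func writeHeader(w io.Writer, maj byte, n uint64) error {
	var buf [9]byte
	size := 0 // number of extra length bytes after the initial byte
	switch {
	case n < 24:
		buf[0] = maj<<5 | byte(n)
	case n <= 0xff:
		buf[0], size = maj<<5|24, 1
	case n <= 0xffff:
		buf[0], size = maj<<5|25, 2
	case n <= 0xffffffff:
		buf[0], size = maj<<5|26, 4
	default:
		buf[0], size = maj<<5|27, 8
	}
	var be [8]byte
	binary.BigEndian.PutUint64(be[:], n)
	copy(buf[1:], be[8-size:]) // keep only the low-order bytes
	_, err := w.Write(buf[:1+size])
	return err
}

func (label *DealLabel) MarshalCBOR(w io.Writer) error {
	// 1. If null, write an empty text string.
	if label == nil {
		return writeHeader(w, majTextString, 0)
	}
	// 2. If too large, return an error.
	if len(label.bs) > maxLabelLen {
		return fmt.Errorf("label too long (%d), max allowed (%d)", len(label.bs), maxLabelLen)
	}
	// 3. Pick the major type from the flag.
	maj := byte(majTextString)
	if label.notString {
		maj = majByteString
	}
	// 4. Write the header and value with no further branching.
	if err := writeHeader(w, maj, uint64(len(label.bs))); err != nil {
		return err
	}
	_, err := w.Write(label.bs)
	return err
}

// encodeLabel is a hypothetical convenience wrapper for trying this out.
func encodeLabel(label *DealLabel) ([]byte, error) {
	var buf bytes.Buffer
	if err := label.MarshalCBOR(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Because both label kinds share one payload field, the only difference between the two encodings is the major type in the header byte.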

Comment on lines 152 to 179
if maj == cbg.MajTextString {

	if length > cbg.MaxLength {
		return fmt.Errorf("label string was too long (%d), max allowed (%d)", length, cbg.MaxLength)
	}

	buf := make([]byte, length)
	_, err = io.ReadAtLeast(br, buf, int(length))
	if err != nil {
		return err
	}
	s := string(buf)
	if !utf8.ValidString(s) {
		return fmt.Errorf("label string not valid utf8")
	}
	label.bs = buf
	label.notString = false
} else if maj == cbg.MajByteString {

	if length > cbg.ByteArrayMaxLen {
		return fmt.Errorf("label bytes was too long (%d), max allowed (%d)", length, cbg.ByteArrayMaxLen)
	}

	bs := make([]uint8, length)
	label.bs = bs
	label.notString = true
	if _, err := io.ReadFull(br, bs[:]); err != nil {
		return err
Member

We should be able to unify these codepaths.

  1. Make sure we have either a text string or a byte string.
  2. Set "not string" if it's a byte string.
  3. Check the length.
  4. Read the value.
  5. If a string, check if it's utf8.
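A sketch of the unified decode path described in the five steps above. As with any standalone illustration, the header parsing is hand-rolled with the standard library rather than taken from cbor-gen, and `maxLabelLen`, `readHeader`, and `decodeLabel` are assumed names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
	"unicode/utf8"
)

const (
	majByteString = 2       // CBOR major type for byte strings
	majTextString = 3       // CBOR major type for text strings
	maxLabelLen   = 2 << 20 // assumed stand-in for the shared length limit
)

type DealLabel struct {
	bs        []byte
	notString bool
}

// readHeader reads a CBOR initial byte and its length argument.
func readHeader(r io.Reader) (byte, uint64, error) {
	var first [1]byte
	if _, err := io.ReadFull(r, first[:]); err != nil {
		return 0, 0, err
	}
	maj, info := first[0]>>5, first[0]&0x1f
	if info < 24 {
		return maj, uint64(info), nil
	}
	if info > 27 {
		return 0, 0, fmt.Errorf("unsupported additional info %d", info)
	}
	size := 1 << (info - 24) // 1, 2, 4, or 8 length bytes follow
	var be [8]byte
	if _, err := io.ReadFull(r, be[8-size:]); err != nil {
		return 0, 0, err
	}
	return maj, binary.BigEndian.Uint64(be[:]), nil
}

func (label *DealLabel) UnmarshalCBOR(r io.Reader) error {
	maj, length, err := readHeader(r)
	if err != nil {
		return err
	}
	// 1. Accept only a text string or a byte string.
	if maj != majTextString && maj != majByteString {
		return fmt.Errorf("label must be a text or byte string, got major type %d", maj)
	}
	// 2. Record whether this is a byte string.
	notString := maj == majByteString
	// 3. Check the length before allocating.
	if length > maxLabelLen {
		return fmt.Errorf("label too long (%d), max allowed (%d)", length, maxLabelLen)
	}
	// 4. Read the value.
	buf := make([]byte, length)
	if _, err := io.ReadFull(r, buf); err != nil {
		return err
	}
	// 5. Strings must be valid UTF-8.
	if !notString && !utf8.Valid(buf) {
		return fmt.Errorf("label string not valid utf8")
	}
	label.bs = buf
	label.notString = notString
	return nil
}

// decodeLabel is a hypothetical convenience wrapper for trying this out.
func decodeLabel(data []byte) (DealLabel, error) {
	var label DealLabel
	err := label.UnmarshalCBOR(bytes.NewReader(data))
	return label, err
}
```

Checking the length before `make` keeps a malformed header from forcing a large allocation, and assigning to the label only at the end leaves it untouched when decoding fails.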

Member

@Stebalien Stebalien left a comment

code nits but otherwise LGTM

actors/builtin/market/deal.go (outdated, resolved)
actors/builtin/market/deal.go (outdated, resolved)
Comment on lines 102 to 105
if _, err := io.WriteString(w, string("")); err != nil {
	return err
}
return nil
Member

Can just return.

Comment on lines 121 to 126
if _, err := w.Write(label.bs); err != nil {
return err
}
}

return nil
Member

Can just return.

Comment on lines 156 to 158
if !label.notString {
s := string(buf)
if !utf8.ValidString(s) {
Member

Suggested change
- if !label.notString {
- s := string(buf)
- if !utf8.ValidString(s) {
+ if !label.notString && !utf8.ValidString(string(buf)) {

@ZenGround0 ZenGround0 merged commit cc74c2c into master Apr 6, 2022

6 participants