Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: opt-in way to allow empty list of roots in CAR headers #461

Merged
merged 4 commits into from
Aug 3, 2023

Conversation

willscott
Copy link
Member

Allow reading a car even when roots is empty

@hannahhoward
Copy link
Collaborator

This needs @rvagg 's eyes. The issue has never been the technical implementation. The problem, according to @rvagg, is that aspects of Filecoin dealmaking depend on this not being empty and are indirectly using go-car to validate the root CID is not empty. I'm not 100% clear this is still a serious issue but it requires some audit of ecosystem code.

lidel
lidel previously requested changes Jul 27, 2023
car.go Outdated
Comment on lines 144 to 147
if len(ch.Roots) == 0 {
return nil, fmt.Errorf("empty car, no roots")
}

Copy link
Contributor

@lidel lidel Jul 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💭 maybe we could add new NewCarReader that takes variadic CarReaderOptions (similar to what we already have for *Writer) and have option that adjust empty-roots behavior (error or noop), and switch to that explicit constructor in boxo/kubo/caboose?

func NewCarReader(r io.Reader, opts ...CarReaderOption) (*CarReader, error) {
    // create the CarReader
    cr := &CarReader{reader: r}

    // apply the options
    for _, opt := range opts {
        opt(cr)
    }

    // return the CarReader
    return cr, nil
}

And CarReaderOption for this PR could be something like

func WithErrorOnEmptyRoots(flag bool) CarReaderOption {
    return func(cr *CarReader) {
        cr.errorOnEmptyRoots = flag
    }
}

And error only when flag is true.

The original NewCarReader would always call with true, so no change in default behavior:

func NewCarReader(r io.Reader) (*CarReader, error) {
    return NewCarReader(r, WithErrorOnEmptyRoots(true))
}

This way:

  1. we don't need to worry about breaking Filecoin and can ship this without impacting existing users
  2. have a useful primitive for customizing parser behavior going forward

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the PR to use this pattern. Does this look reasonable?

Copy link
Contributor

@lidel lidel Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. your version was breaking API change and removed tests, probably not what we want 🙃
I've pushed bigger refactor in 67151fc – it maintains backward API compatibility, so the old code like Filecoin one will work the same.

At the same time the new code can switch to NewCarReaderWithOptions to define explicit behavior, and even Filecoin can slowly migrate to that too, once empty roots are validated elsewhere.

Copy link
Contributor

@lidel lidel Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rvagg when you are back, mind eyeballing this one? 🙏

My suggestion (67151fc) is to keep NewCarReader as-is, and add NewCarReaderWithOptions and WithErrorOnEmptyRoots(bool) which allows both Filecoin and IPFS use cases to overide implicit behavior.

@lidel lidel changed the title Fix empty car root fix: allow empty list of roots in CAR headers Jul 27, 2023
willscott and others added 2 commits July 27, 2023 22:11
this allows us to keep backward compatibility
and explicitly enable both behaviors, which means
code that expects error can be migrated into implicit flag
over time
car_test.go Outdated
Comment on lines 166 to 170
}, {
"{version:1}",
"0aa16776657273696f6e01",
"empty car, no roots",
"",
Copy link
Contributor

@lidel lidel Jul 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should keep this test, we need to have test for both root present and no root, to guard the behavior. Restored in 67151fc + added explicit test for WithErrorOnEmptyRoots(true|false)

@lidel lidel dismissed their stale review July 31, 2023 23:37

addressed in 67151fc

@lidel lidel changed the title fix: allow empty list of roots in CAR headers fix: opt-in way to allow empty list of roots in CAR headers Aug 2, 2023
Copy link
Contributor

@lidel lidel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is this no longer need explicit in-depth approval from @rvagg because we no longer make breaking change to API and we do not change the default behavior.

Disabling error on empty roots is opt-in via new constructor, which is backward-compatible and imo we could ship this as go-car v0.7.0.

cc @willscott @rvagg @aschmahmann @aarshkshah1992 – any concerns?

@willscott
Copy link
Member Author

@lidel did you see my one comment on your commit?

you keep the state of the header check on the reader, even though it's only used in the constructor.
I asked:

do we need to keep this state around on the car reader?
we only use the option in the constructor itself, so it seems preferable to have a transient config struct for the option to > modulate, rather than holding onto the boolean for no reason.

@lidel
Copy link
Contributor

lidel commented Aug 2, 2023

@willscott I did not see it. In the future, please drop comments like this on PRs for visibility 🙏

I made a concious decision to remove carReaderConfig because this is not a one-off feature.
We create a surface for future options, and roots is not a representative case.

Other flags will modify Reader behavior beyond the header.

For example, we could add WithErrorOnIdentityBlock or WithIgnoreInvalidBlocks and these need to be kept around for the entire life of the reader to serve their purpose, so no point in having a separate struct just for a single boolean around roots behavior.

@willscott
Copy link
Member Author

cool. I'm happy with this as is.

@masih or @rvagg can you approve as a code owner?

@rvagg
Copy link
Member

rvagg commented Aug 3, 2023

I guess this is OK, although we did say that we weren't going to touch the v0 package anymore.

I do have a v3 branch here that's nearly ready to propose for a semver-major bump that combines v0 and v2 code into a single package so we no longer have this awkward ipld/go-car and ipld/go-car/v2 import business and still need to rely on this older v0 stuff. One of the aims of that was to do away with the zero-roots checks entirely. I could push forward with that if this is actually that important. But as I said on Slack, it seems to me that getting to the bottom of why we are seeing zero-root CARs in the Rhea flow seems like a pretty important thing to be working on; we don't have a good explanation for it and that's a bit of a problem.

@willscott
Copy link
Member Author

@rvagg - but we have continued to update v1. - e.g with the updated use of go-merkledag / etc.

I think we do this now so we can move forward, rather than blocking on not having a good interface for reading this type of car

@willscott willscott merged commit 92e0b87 into master Aug 3, 2023
26 checks passed
@willscott willscott deleted the fix/empty-car-root branch August 3, 2023 10:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants