Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor UnixFS CoreAPI #5501

Merged
merged 24 commits into from
Oct 5, 2018
Merged

Refactor UnixFS CoreAPI #5501

merged 24 commits into from
Oct 5, 2018

Conversation

magik6k
Copy link
Member

@magik6k magik6k commented Sep 20, 2018

This implements most options applying to single files supported by ipfs add.

TODO:

  • --pin, --only-hash
  • docstrings
  • option handling cleanup

To keep this PR manageable, I'm leaving multi-file / filestore / etc. for next PRs

@magik6k magik6k added the topic/core-api Topic core-api label Sep 20, 2018
@ghost ghost assigned magik6k Sep 20, 2018
@ghost ghost added the status/in-progress In progress label Sep 20, 2018
@magik6k magik6k mentioned this pull request Sep 20, 2018
51 tasks
@magik6k magik6k changed the title WIP: More options for coreapi.unixfs.Add More options for coreapi.unixfs.Add Sep 25, 2018
@kevina
Copy link
Contributor

kevina commented Sep 28, 2018

This seams like a overcomplicated way of going about it. Was this discussed with any one before?

@Stebalien
Copy link
Member

@kevina which part?

It's using the functional options pattern (https://dave.cheney.net/2014/10/17/functional-options-for-friendly-apis) which we've started using pretty much everywhere. It allows us to easily extend these options.

@kevina
Copy link
Contributor

kevina commented Sep 28, 2018

I will withhold judgement until I will have a closer look at the "functional options pattern" link provided. (It just seams to me that it is adding a lot of not necessary useful indirection.)

// directly into CID instead of being stored and addressed by it's hash
//
// Note that while there is no hard limit on the number of bytes here, it should
// be kept at something reasonably low like 32b (default for 'ipfs add')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording here makes it sounds like the value should be 32 bytes or smaller. That is misleading. Values less than 32 bytes should not be encouraged. The default of 32 was chosen so that when the Inline option is enabled it won't increase the size of the hashes, but a limit of 32 bytes is not very usefel. A more reasonable upper limit would be 64 bytes, but if you are not concerned with displaying it than values like 128 or even 256 are also okay and based on my informal testing do not cause problems anywhere in IPFS.

kevina
kevina previously requested changes Sep 29, 2018
Copy link
Contributor

@kevina kevina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the way the Inline option is handled.

core/coreapi/interface/options/unixfs.go Show resolved Hide resolved
@@ -38,7 +40,6 @@ var hello = "/ipfs/QmQy2Dw4Wk7rdJKjThjYXzfFJNaRKRHhHP5gHHXroJMYxk"
var helloStr = "hello, world!"

// `echo -n | ipfs add`
var emptyFile = "/ipfs/QmbFMke1KXqnYyBBWxB74N4c5SBnJMVAiMNRcGu6x1AwQH"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for removing this?

@kevina
Copy link
Contributor

kevina commented Sep 29, 2018

It's using the functional options pattern

Okay. That is actually a pretty good idea and I take back my earlier comments.

My personal preference is to complete the transformation and convert the add command to use the new code. This way we can be sure we don't miss anything, but I can understand if we don't want to do that all at once.

// Note that while there is no hard limit on the number of bytes, it should
// be kept at a reasonably low value, like 64 bytes if you intend to display
// these hashes. Larger values like 256 bytes will work fine, but may affect
// de-duplication of smaller blocks
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where did you get the de-duplication of smaller blocks from?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the wording isn't quite clear. The idea is that when you have many duplicated small blocks (100s of bytes), and a large inline limit, you can end up with quite a bit bigger DAG (i.e. the cumulative block size of the dag would be bigger).

If this is wrong / not clear enough, can you suggest better wording here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think of that.

My main concern with larger blocks is if they get too large they might cause a problem somewhere even if they work find 98% of the time.

func (unixfsOpts) InlineLimit(limit int) UnixfsAddOption {
return func(settings *UnixfsAddSettings) error {
settings.InlineBlocks = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I would do this, setting the InlineLimit on the command line does not enable inlining.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, what are the cases somebody (other than in the command implementation) would set this option and then not want to inline blocks? It feels a bit like saying 'I want to inline blocks up to 123 bytes and I want to inline blocks'

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A user might want to change the defaults. That is, I might want to create a set of default options that change the default inline limit without actually enabling inlining (unless requested by the user).

Note: I don't actually care one way or the other about this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yeah. The options case makes enough sense, I'll go with that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was thinking along the same line as @Stebalien. Also, since I assume the idea is to switch the ipfs add command to use the new API it best to keep the behavior constant.

@@ -124,13 +126,26 @@ func (unixfsOpts) RawLeaves(enable bool) UnixfsAddOption {
}
}

// InlineBlocks tells the adder to inline small blocks into CIDs
func (unixfsOpts) InlineBlocks(enable bool) UnixfsAddOption {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just Inline will be clear enough. It will also be consistent with the command line.

// de-duplication of smaller blocks.
//
// Setting this value too high may cause various problems, such as render some
// blocks unfetchable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are saying too much here and at this point will just be confusing the user. I would just simplify this with

Note that while there is no hard limit on the number of bytes, it should be kept at a reasonably low value, such as 64 and no more then 1k. Setting this value too high may cause various problems, such as render some blocks unfetchable.

@Stebalien thoughts?

@kevina kevina dismissed their stale review October 2, 2018 23:11

Changes Addressed.

@kevina
Copy link
Contributor

kevina commented Oct 2, 2018

@Stebalien do we want to get this in as is? I personally would rather we wait until all the options are implemented and the ipfs add command is converted. This was we get the benefit of all the sharness tests to make sure here are no bugs or changes in behavior.

@Stebalien
Copy link
Member

I kind of agree. @magik6k does it look like that would be a large addition to this patch?

@magik6k
Copy link
Member Author

magik6k commented Oct 2, 2018

I don't really know, it kind of depends on how well go-ipfs-cmdkit/files stuff will fit here, I just didn't fiddle with that yet, so it might be anywhere from a few small commits to a few-thousand-diff-lines monster (I'm quite positive that this will be on the manageable side).

I'm fine with going either way, just want to make sure this doesn't get frozen because it grew too big.

@kevina
Copy link
Contributor

kevina commented Oct 2, 2018

@magik6k I would attempt to do the complete conversion. If it becomes unmanageable then I would split out parts into a separate p.r.

I would rather not have lots of duplicate code in go-ipfs if this commit lands and then ipfs add doesn't get converted for some reason.

@Stebalien
Copy link
Member

I'm fine with going either way, just want to make sure this doesn't get frozen because it grew too big.

We can also do that as a followup PR against this one.

@Stebalien Stebalien added RFM status/blocked Unable to be worked further until needs are met labels Oct 2, 2018
@magik6k magik6k removed RFM status/blocked Unable to be worked further until needs are met labels Oct 3, 2018
@magik6k magik6k changed the title More options for coreapi.unixfs.Add Refactor UnixFS CoreAPI Oct 3, 2018
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
@kevina kevina self-requested a review October 4, 2018 00:02
Copy link
Contributor

@kevina kevina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left one comment in an earlier review about documentation for the InlineLimit option that still needs to be addressed.

Otherwise this mostly LGTM and seams solidly done. I won't be able to go over this with a fine tooth comb but will look it over more closely in a day or so as I don't want this too go to long before merging.

@kevina kevina self-requested a review October 4, 2018 01:48
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
// Setting this value too high may cause various problems, such as render some
// blocks unfetchable
// Note that while there is no hard limit on the number of bytes, it should be
// kept at a reasonably low value, such as 64 and no more than 1k. Setting this
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Stebalien should we say 1k here or use a lower value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about "it should be kept at a reasonably low value, such as 64; implementations may choose to reject anything larger". We can always increase this if someone complains.

(Note: I never really intend to reject anything less than 1k but users tend to ignore all but strongly worded warnings).

Copy link
Contributor

@kevina kevina Oct 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. I was actually trying to avoid creating too strong a suggestion. But I am okay with.

// Note that while there is no hard limit on the number of bytes, it should be
// kept at a reasonably low value, such as 64; implementations may choose to
// reject anything larger.

And just leave it at that.

(addressing CR)

License: MIT
Signed-off-by: Steven Allen <steven@stebalien.com>
@ghost ghost assigned Stebalien Oct 5, 2018
License: MIT
Signed-off-by: Łukasz Magiera <magik6k@gmail.com>
@Stebalien Stebalien merged commit 5ba0976 into master Oct 5, 2018
@ghost ghost removed the status/in-progress In progress label Oct 5, 2018
@Stebalien Stebalien deleted the feat/coreapi/unixfs branch October 5, 2018 18:18
@Stebalien
Copy link
Member

😅 merged before we get any new conflicts.

hacdias pushed a commit to ipfs/boxo that referenced this pull request Jan 27, 2023
Refactor UnixFS CoreAPI

This commit was moved from ipfs/kubo@5ba0976
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/core-api Topic core-api
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants