
Pinning is slow when there are many pins #5221

Closed · Stebalien opened this issue Jul 13, 2018 · 29 comments
Labels: kind/bug (A bug in existing code, including security flaws), status/deferred (Conscious decision to pause or backlog), topic/perf (Performance)

Comments

@Stebalien (Member) commented Jul 13, 2018

We store all pins in a single massive object, so adding and removing pins is really slow when we have many pins (see the sketch after the lists below).

This affects:

  • ipfs dag add --pin=true
  • ipfs add
  • ipfs pin add
  • ipfs pin rm

Listing pins also appears to be slow, but for different reasons:

  1. ipfs pin ls buffers pins in memory before sending them back to the user (see pin ls should stream the result #6304).
  2. ipfs pin ls lists all pinned blocks, directly or indirectly, by default. Calling ipfs pin ls --type=recursive is much faster.
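
To make the cost concrete, here is a minimal, stdlib-only Go sketch of what a single flat pin object implies; the `flatPinSet` type and `pins.json` file are made up for illustration and this is not go-ipfs's actual pinner. Every add re-serializes and rewrites the entire set, so each operation costs O(total pins).

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// flatPinSet mimics keeping every pin in one big object: any change forces
// the whole set to be re-serialized and rewritten. Names are illustrative;
// this is not go-ipfs's actual pinner.
type flatPinSet struct {
	Recursive map[string]bool `json:"recursive"`
	Direct    map[string]bool `json:"direct"`
	path      string
}

// Add records a pin and then flushes the ENTIRE set to disk, which is why
// adds and removes get slower as the number of pins grows.
func (p *flatPinSet) Add(cid string, recursive bool) error {
	if recursive {
		p.Recursive[cid] = true
	} else {
		p.Direct[cid] = true
	}
	return p.flush() // O(total pins) on every single add
}

func (p *flatPinSet) flush() error {
	data, err := json.Marshal(p) // serializes all pins, not just the new one
	if err != nil {
		return err
	}
	return os.WriteFile(p.path, data, 0o644)
}

func main() {
	pins := &flatPinSet{
		Recursive: map[string]bool{},
		Direct:    map[string]bool{},
		path:      "pins.json",
	}
	for i := 0; i < 1000; i++ {
		// Each iteration rewrites every pin added so far.
		_ = pins.Add(fmt.Sprintf("fake-cid-%d", i), true)
	}
}
```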

Proposed solutions:


I'm filing this issue so we have a single place that succinctly describes the entire problem and all its variants.

@Stebalien Stebalien added kind/bug A bug in existing code (including security flaws) topic/perf Performance labels Jul 13, 2018
@Stebalien Stebalien added the status/deferred Conscious decision to pause or backlog label Jul 13, 2018
@ghost commented Jul 13, 2018

Also sounds like another use case for an embedded graph database

@Stebalien (Member, Author)

Not really. We have MFS; we can just use that. The current blockers are:

  • Private content (meh, IPFS doesn't really have this at the moment anyways).
  • Fancy pins. Currently, the only "fancy pin" we have is the "direct" pin (non-recursive).
  • Background "fetch" jobs.
  • The ability to wait on the background fetch job.

Unfortunately, this'll only get worse as we hack in new pin types for cluster. We need some way to specify (in unixfs) how a file/directory should be pinned (where pin policies higher up the directory tree take precedence).

@Stebalien (Member, Author)

The current thought here is to introduce an intermediate fix that stores pins in go-ipld-hamt (see the sketch after this list). Blockers:

  • Make the refmt version of go-ipld-cbor correctly handle CIDs.
  • Merge the refmt patches into go-ipld-cbor.
  • Finish up go-ipld-hamt.
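
For intuition about why a HAMT helps, here is a rough sketch of the underlying idea using plain prefix-based sharding rather than go-ipld-hamt itself; the `shardedPinSet` type is hypothetical. An add only loads and rewrites the one small shard containing the pin, not the whole set.

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// shardedPinSet splits pins across many small shards so that adding or
// removing a pin only rewrites one shard, not the whole set. It is a
// simplified stand-in for the HAMT-based layout discussed above.
type shardedPinSet struct {
	dir string
}

// shardFor picks a shard from the first byte of the CID's hash, roughly
// how a HAMT fans keys out across its nodes.
func (s *shardedPinSet) shardFor(cid string) string {
	h := sha256.Sum256([]byte(cid))
	return filepath.Join(s.dir, fmt.Sprintf("shard-%02x.json", h[0]))
}

// Add loads only the affected shard, updates it, and writes it back.
// The cost is proportional to the shard size, not the total pin count.
func (s *shardedPinSet) Add(cid string) error {
	path := s.shardFor(cid)
	shard := map[string]bool{}
	if data, err := os.ReadFile(path); err == nil {
		if err := json.Unmarshal(data, &shard); err != nil {
			return err
		}
	}
	shard[cid] = true
	data, err := json.Marshal(shard)
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

func main() {
	s := &shardedPinSet{dir: os.TempDir()}
	_ = s.Add("fake-cid-1")
	_ = s.Add("fake-cid-2") // likely touches a different, equally small shard
}
```

A real HAMT also keeps the structure content-addressed and balanced, but the cost model is the same: work per change is bounded by node/shard size rather than total pin count.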

@ivan386 (Contributor) commented Aug 2, 2018

Maybe we just need to use a read-only or archive flag for pinned blocks in the underlying file system?

@Stebalien (Member, Author)

Unfortunately, it's not quite that simple. Pinning happens at a higher layer and not all of our datastores store one file per block.

@pjz commented Aug 23, 2018

This is also causing an issue with monitoring over at netdata/netdata#3156 as it makes a lot of 'ipfs pin ls' calls.

@bonedaddy (Contributor) commented Aug 24, 2018

Is the pinset object stored on disk and read/written when operations are performed? If so, wouldn't it be possible to load the object into memory and read/write it there, getting fast memory-speed I/O? You could copy the object to disk as a backup, but you wouldn't incur expensive read operations since you'd be reading from the in-memory object. This would serve as a reasonable intermediate fix until the pinning system at large is reworked.

@Stebalien (Member, Author)

Reading is fast; we store the pinset in memory. The slow part is flushing to disk.

@pjz commented Aug 30, 2018

Why would an 'ipfs pin ls' flush to disk? I think there must be something else going on, if the netdata guys are seeing an inordinate load due to an 'ipfs pin ls' being sent once every 5s or so.

@Stebalien (Member, Author)

pin ls by default lists all indirectly pinned objects (children of recursive pins). I don't know why it does this by default but it does...

You can list just the pins you added by running ipfs pin ls --type=recursive and ipfs pin ls --type=direct.

@pjz commented Aug 30, 2018

That still doesn't answer why the guys over on netdata/netdata#3156 are seeing massive IPFS resource usage when 1) they have a large repo (several thousand objects) and 2) they turn on monitoring (which does an 'ipfs pin ls' every few seconds). Is there instrumentation they could turn on?

@Stebalien (Member, Author) commented Aug 30, 2018

It's listing every single object (block) that has been pinned. It's consuming a ton of RAM because we, unfortunately, create a list of pins in memory before returning them to the client. We should fix this (the second part), but doing so will be a breaking API change, so we'll have to be careful.
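
As a rough illustration of the buffering-vs-streaming difference (the function names and types below are made up, not the go-ipfs API): buffering materializes every pinned CID before anything is returned, while streaming hands pins back one at a time so memory stays roughly constant.

```go
package main

import "fmt"

// listAllBuffered mirrors the current behavior described above: the full
// list of pinned CIDs is built in memory before anything is returned, so
// memory use grows with the number of (indirectly) pinned blocks.
func listAllBuffered(pins []string) []string {
	out := make([]string, 0, len(pins))
	out = append(out, pins...)
	return out
}

// listAllStreamed sends pins one at a time over a channel, so the caller
// can start consuming immediately and memory stays roughly constant.
func listAllStreamed(pins []string) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, p := range pins {
			ch <- p
		}
	}()
	return ch
}

func main() {
	pins := []string{"cid-a", "cid-b", "cid-c"} // stand-ins for real CIDs
	fmt.Println(len(listAllBuffered(pins)))     // whole slice exists at once
	for p := range listAllStreamed(pins) {
		fmt.Println(p) // pins arrive incrementally
	}
}
```

The API-breaking part is the second shape: callers have to consume results incrementally instead of receiving one complete list.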

@Stebalien (Member, Author)

It's also probably garbage collecting a bunch (we're working on some fixes to CIDs that'll make them allocate less but that's still in progress).

@bonedaddy (Contributor) commented Aug 30, 2018

@pjz So on my own nodes, to avoid having to constantly poll IPFS and incur slow performance from examining the pinset, I maintain a database which contains an exact copy of the pins my IPFS nodes currently have. Any update that would affect the pinset must also update the database.

By doing this, I avoid having to contact my IPFS node and perform performance-impacting operations like ipfs pin ls.

While this isn't desirable, it has been working very well. It does have a couple of considerations, namely that all operations which affect the pinset must also update the DB. Don't forget, IPFS is still very new, so sometimes you have to make small accommodations until such issues are resolved.
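
For anyone weighing the same workaround, here is a minimal Go sketch of the pattern, assuming a hypothetical `pinClient` interface rather than any real go-ipfs API: every pin/unpin goes through a wrapper that also updates a local mirror, and listings are answered from the mirror instead of the node.

```go
package main

import "sync"

// pinClient is a hypothetical stand-in for whatever client talks to the
// IPFS node (HTTP API wrapper, go-ipfs-api, etc.); not a real go-ipfs type.
type pinClient interface {
	Pin(cid string) error
	Unpin(cid string) error
}

// mirroredPinner wraps the client so every pin change also updates a local
// mirror; listings are then answered from the mirror, never from the node.
type mirroredPinner struct {
	client pinClient
	mu     sync.RWMutex
	mirror map[string]bool
}

func newMirroredPinner(c pinClient) *mirroredPinner {
	return &mirroredPinner{client: c, mirror: map[string]bool{}}
}

func (m *mirroredPinner) Pin(cid string) error {
	if err := m.client.Pin(cid); err != nil {
		return err
	}
	m.mu.Lock()
	m.mirror[cid] = true
	m.mu.Unlock()
	return nil
}

func (m *mirroredPinner) Unpin(cid string) error {
	if err := m.client.Unpin(cid); err != nil {
		return err
	}
	m.mu.Lock()
	delete(m.mirror, cid)
	m.mu.Unlock()
	return nil
}

// List never touches the node; it reads only the local mirror.
func (m *mirroredPinner) List() []string {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]string, 0, len(m.mirror))
	for cid := range m.mirror {
		out = append(out, cid)
	}
	return out
}

// fakeClient lets the sketch run without a real node.
type fakeClient struct{}

func (fakeClient) Pin(string) error   { return nil }
func (fakeClient) Unpin(string) error { return nil }

func main() {
	p := newMirroredPinner(fakeClient{})
	_ = p.Pin("fake-cid")
	_ = p.List() // answered locally
}
```

The trade-off is exactly the one noted above: every code path that changes pins has to go through the wrapper, or the mirror drifts from the node's real state.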

@djdv (Contributor) commented Aug 30, 2018

> I maintain a database which contains an exact copy of the pins my IPFS nodes

I should mention that in working with pins, I've also come to the pattern of maintaining my own cache, to avoid delays on large nodes, even when only listing recursive pins.

In my specific case, I'm interested in the listing being more performant, but also in having some means of notification from the node, like an event I can subscribe to that signals when the pinset has changed.
This would allow me to maintain my own state: poll once, then poll again only on a state change (or apply a delta from the event's info), instead of polling on a timer or some other arbitrary metric.

For context, I'm dealing with ipfs mount at the moment, and refreshing the listing for /ipfs entails getting the node's pinset.
I'm also interested in other events from the node, such as knowing when keys have changed, MFS has changed, etc.
I'm willing to bet monitoring tools would be interested in this as well.
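
Nothing like this exists in the daemon today; purely as a sketch of the kind of subscription being asked for (all names below are hypothetical), a small in-process notifier could broadcast pinset changes so consumers refresh their cache only when something actually changed.

```go
package main

import (
	"fmt"
	"sync"
)

// PinEvent is a hypothetical notification that the pinset changed.
type PinEvent struct {
	Cid   string
	Added bool // false means the pin was removed
}

// pinNotifier is a minimal in-process pub/sub: subscribers receive an event
// whenever the pinset changes, so they can refresh their own cache instead
// of polling `ipfs pin ls` on a timer.
type pinNotifier struct {
	mu   sync.Mutex
	subs []chan PinEvent
}

func (n *pinNotifier) Subscribe() <-chan PinEvent {
	ch := make(chan PinEvent, 16)
	n.mu.Lock()
	n.subs = append(n.subs, ch)
	n.mu.Unlock()
	return ch
}

// publish would be called by the pinner after every successful add/remove.
func (n *pinNotifier) publish(ev PinEvent) {
	n.mu.Lock()
	defer n.mu.Unlock()
	for _, ch := range n.subs {
		select {
		case ch <- ev: // deliver if the subscriber is keeping up
		default: // otherwise drop rather than block the pinner
		}
	}
}

func main() {
	n := &pinNotifier{}
	events := n.Subscribe()
	n.publish(PinEvent{Cid: "fake-cid", Added: true})
	fmt.Println(<-events) // consumer reacts only when something changed
}
```

Exposing this to external tools (over the HTTP API or a pubsub topic) is a separate design question; the sketch only shows the in-process shape of the idea.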

@bonedaddy (Contributor) commented Aug 30, 2018

Yes, absolutely, it's made my node perform significantly better. I've currently begun moving to a model where the only time I need to talk to my IPFS node to list anything is for crucial operations. Otherwise, everything that isn't a write operation reads from my cache/database.

@pjz commented Aug 31, 2018

While those are great workarounds, they're not really feasible for a general monitoring solution. I guess they'll just have to wait until the IPFS server gets it together. I think it's clear that whatever data structure it's using needs to be re-evaluated or supplemented so that this kind of monitoring/usage doesn't cause it to eat itself.

@Stebalien (Member, Author)

So, adding pins should be faster. But listing every single object that has been pinned (directly or indirectly by some recursive pin) in your datastore will always be somewhat slower.

@pjz commented Sep 8, 2018

If everyone's solution is to maintain a parallel cache of what pins exist... why not have IPFS do that internally instead? Keep a cache that's invalidated on add/remove of pins, but otherwise is untouched. Then repeated calls to 'ipfs pin ls' would be trivial. Maybe make 'ipfs pin verify' also serve as a way to manually invalidate the cache/force a rebuild of it.
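
As a sketch of that proposal (the `pinner` interface below is a hypothetical stand-in for the daemon's internal pinner, not its real API): a thin wrapper invalidates a memoized listing on add/remove and rebuilds it lazily, so repeated listings between changes are cheap.

```go
package main

import "sync"

// pinner is a hypothetical stand-in for the daemon's internal pinner,
// not the real go-ipfs interface.
type pinner interface {
	Add(cid string) error
	Remove(cid string) error
	ListAll() []string // the expensive full enumeration
}

// cachedPinner memoizes the last ListAll result and invalidates it whenever
// a pin is added or removed, so repeated listings between changes are cheap.
type cachedPinner struct {
	inner pinner
	mu    sync.Mutex
	cache []string // nil means "stale, rebuild on next ListAll"
}

func (c *cachedPinner) Add(cid string) error {
	c.mu.Lock()
	c.cache = nil // invalidate on change
	c.mu.Unlock()
	return c.inner.Add(cid)
}

func (c *cachedPinner) Remove(cid string) error {
	c.mu.Lock()
	c.cache = nil // invalidate on change
	c.mu.Unlock()
	return c.inner.Remove(cid)
}

func (c *cachedPinner) ListAll() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.cache == nil {
		c.cache = c.inner.ListAll() // rebuild once after a change
	}
	return c.cache
}

// fakePinner is just enough to make the sketch runnable.
type fakePinner struct{ pins map[string]bool }

func (f *fakePinner) Add(cid string) error    { f.pins[cid] = true; return nil }
func (f *fakePinner) Remove(cid string) error { delete(f.pins, cid); return nil }
func (f *fakePinner) ListAll() []string {
	out := make([]string, 0, len(f.pins))
	for cid := range f.pins {
		out = append(out, cid)
	}
	return out
}

func main() {
	c := &cachedPinner{inner: &fakePinner{pins: map[string]bool{}}}
	_ = c.Add("fake-cid")
	_ = c.ListAll() // rebuilt once
	_ = c.ListAll() // served from cache
}
```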

@djdv (Contributor) commented Sep 8, 2018

@pjz
I think that could help improve the performance.
I know that awareness of the node's state is a separate issue, but if we have to come up with a messaging system for invalidating some node-wide pinset cache, we may as well broadcast those same events to anyone who still wants to know when the state has changed.
The practical reason for this would still just be to avoid unnecessary polling calls in long-lived processes.
Even if pin ls is fast, it'd be nice to update your copy of the pinset only when it has actually changed.

However, this only seems worth implementing if there's more than one event (more than just "pins have changed").
I mentioned some others before, like writes to MFS, an IPNS key being created/deleted/updated, etc.
This would allow people to maintain their own cache of various states if they like, but still have a generic implementation underneath for fast operation in the general case.

Any opinions on this?

@pjz commented Sep 9, 2018

What you describe sounds somewhat like a way to tap into the logging system.

@obo20 commented Jul 2, 2019

@Stebalien Has there been much progress / prioritization on this front? As we continue to scale, this becomes increasingly relevant.

@Stebalien (Member, Author)

No progress.

@S3bb1 commented Jul 3, 2019

We're also facing this issue with around 2 million hashes and around 400-500k pins. Can we support you in any way? We're currently working around this with multiple IPFS instances.

@Stebalien (Member, Author)

@dirkmc you were looking into this for js-ipfs. Are you still planning on applying that same optimization to go-ipfs?

@dirkmc (Contributor) commented Jul 11, 2019

@Stebalien I'm currently doing some research to understand where the performance bottlenecks are with adding large numbers of files to go-ipfs, which will likely include performance analysis for pinning.

Before making any pinning optimizations, we'll likely want to decide if it makes sense for pins to be stored in the blockstore, which is a bigger conversation.

@Zorlin commented May 9, 2021

I will soon need to add a million+ pins, and I'm hoping this can be improved.

@Stebalien (Member, Author)

This was actually fixed in go-ipfs 0.8.0; we just never closed the issue (see https://github.com/ipfs/go-ipfs/blob/master/CHANGELOG.md#-faster-local-pinning-and-unpinning). The number of pins you have should no longer matter when adding new pins.

@Zorlin commented May 10, 2021

Awesome, thanks! Love your work. I'll report any issues with scalability later if we run into them.
