Add ARC caching and bloom filter for blockstorage #2885
Conversation
Force-pushed from 4765754 to 91e0251
Force-pushed from db06568 to 17876d9
blocked on #2601
Force-pushed from c226019 to f2130f5
@kevina In other issues you said that AllKeysChan may be hard on performance; what is the alternative?
@Kubuxu, unfortunately, if you use the datastore there isn't one. I decided that for the filestore it would make more sense to just use LevelDB directly. If I were you, I'd try hacking go-datastore to increase the channel buffer size here, https://github.com/ipfs/go-datastore/blob/master/query/query.go#L185, and see if that speeds things up (try a buffer size of, say, 16, then maybe something huge like 1024 or larger). To see how much overhead is really involved, you could hack go-datastore to expose the underlying LevelDB instance and query it directly.
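For illustration, the tweak described above amounts to enlarging the buffered channel through which go-datastore streams query results. A minimal sketch, assuming a buffered results channel; the names, layout, and buffer size below are illustrative and not the actual go-datastore source:

```go
// Hypothetical sketch of the suggested tweak: the real code in query.go may
// be structured differently.
package query

// Result mirrors the shape of a single query result (key plus value or error).
type Result struct {
	Key   string
	Value []byte
	Error error
}

// resultBufferSize is the knob being tuned: a larger buffer lets the
// datastore iterator run ahead of a slow consumer such as AllKeysChan.
const resultBufferSize = 16 // try 16 first, then something huge like 1024

func newResultChan() chan Result {
	return make(chan Result, resultBufferSize)
}
```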
Force-pushed from f2130f5 to 983eaf4
Rebased onto master, tests are running. Should I implement the block cache in this PR or in a separate one?
@Kubuxu do it separately if it's not too much trouble.
Force-pushed from 2be444d to d61f526
Force-pushed from 7b982fe to 7060a27
Rebase on master so we don't get the docker failures?
Force-pushed from 7060a27 to 926d79e
I should also squash everything, as the first commits are failing.
Replace write_cache with bloom_cache
Improve ARC caching
Fix small issue in case AllKeysChan fails
deps: Update go-datastore
blocks/blockstore: Invalidate ARC cache before deleting block
deps: Update go-datastore
License: MIT
Signed-off-by: Jakub Sztandera <kubuxu@protonmail.ch>
Force-pushed from 926d79e to 5d83d89
Squashed.
@@ -131,7 +131,7 @@ func setupNode(ctx context.Context, n *IpfsNode, cfg *BuildCfg) error {
 	var err error
 	bs := bstore.NewBlockstore(n.Repo.Datastore())
-	n.Blockstore, err = bstore.WriteCached(bs, kSizeBlockstoreWriteCache)
+	n.Blockstore, err = bstore.BloomCached(bs, 256*1024, kSizeBlockstoreWriteCache)
let's make this a constant
I am introducing CacheOpts in the whole-block ARC caching PR I am working on right now. Would it be OK if I added it in that PR?
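For context, a CacheOpts-style constructor could look roughly like the sketch below, replacing the magic 256*1024 in the diff above. The field and function names are assumptions for illustration, not necessarily the API that eventually landed; Blockstore and BloomCached are the existing identifiers from blocks/blockstore:

```go
// Illustrative only: one possible shape for CacheOpts. Blockstore and
// BloomCached are assumed to exist in this package; everything else here
// is hypothetical.
package blockstore

type CacheOpts struct {
	HasBloomFilterSize int // size of the bloom filter backing Has()
	HasARCCacheSize    int // number of entries kept in the ARC cache
}

func DefaultCacheOpts() CacheOpts {
	return CacheOpts{
		HasBloomFilterSize: 256 * 1024,
		HasARCCacheSize:    64 << 10,
	}
}

// CachedBlockstore wires the caches around a raw blockstore using the
// options above instead of hard-coded constants.
func CachedBlockstore(bs Blockstore, opts CacheOpts) (Blockstore, error) {
	cbs, err := BloomCached(bs, opts.HasBloomFilterSize, opts.HasARCCacheSize)
	if err != nil {
		return nil, err
	}
	return cbs, nil
}
```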
sure
License: MIT Signed-off-by: Jakub Sztandera <kubuxu@protonmail.ch>
@whyrusleeping I will add the context passing in the next PR too. I will have to rework all places where the ctor is used either way.
@Kubuxu I'm still not sure this is ready. How does it fare on long-running nodes? What happens when the filter gets full, or when there are lots of deletes, etc.?
The main problem would be a lot of deletes: the false-positive ratio would rise proportionally with the ratio of blocks deleted. In the worst-case scenario (all blocks deleted) performance would be back where it was before the bloom filter (+1ms of latency). The other case would be a repo much bigger than the bloom filter's preselected size. Our current settings should provide optimal operation (<1% false positives) up to 30k blocks in storage; at about 100k blocks the false-positive rate rises to about 15% (but it still reduces the number of Has requests by roughly 73%, with 2% real positives and 15% false positives). I plan to add automatic rebuild (and expansion), but for that I would need concrete metrics from real operation. That is why I plan to work this week on the metrics interface I mentioned at the sprint.
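To make figures like the ones above easier to sanity-check, here is the standard bloom-filter false-positive approximation in a small sketch. The filter size and hash count below are assumptions picked for illustration, not the actual settings used by this PR:

```go
// Standard bloom-filter approximation: p ≈ (1 - e^(-k*n/m))^k, where m is
// the filter size in bits, k the number of hash functions, and n the number
// of stored elements. The m and k values below are illustrative assumptions.
package main

import (
	"fmt"
	"math"
)

func falsePositiveRate(mBits, hashes, elements float64) float64 {
	return math.Pow(1-math.Exp(-hashes*elements/mBits), hashes)
}

func main() {
	const mBits = 512 * 1024 // assumed filter size in bits
	const k = 7              // assumed number of hash functions
	for _, n := range []float64{30000, 100000, 300000} {
		fmt.Printf("blocks=%8.0f  false-positive rate=%.4f\n",
			n, falsePositiveRate(mBits, k, n))
	}
}
```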
@Kubuxu alright, that sounds good to me. Let's get this in and get some mileage on it.
it would be great to see benchmark graphs on a large typical workload with and without this PR. We need to get into the habit of benchmarking and graphing with large batteries of tests.
In this case the aim of introducing bloom filtering and ARC caching on the blockstore is narrower: we are not caching blocks themselves (we are trying that in #2942, and the performance benefit isn't clear), only whether a block is available in the datastore or not. The best metric for this change set will be the reduction of disk hits when caching is available, which I was supposed to work on quite a while ago by improving metric collection.
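As a rough illustration of that layering (not the PR's actual code): the bloom filter can only answer "definitely not present", the ARC cache remembers recent Has() answers, and anything else falls through to the underlying datastore. All type and method names below are hypothetical:

```go
// Sketch of a Has()-only caching layer; hypothetical interfaces throughout.
package blockcache

type bloomFilter interface {
	Has(key string) bool // false => key is definitely absent
	Add(key string)
}

type arcCache interface {
	Get(key string) (has bool, ok bool)
	Add(key string, has bool)
}

type datastore interface {
	Has(key string) (bool, error)
}

type cachedHas struct {
	bloom bloomFilter
	arc   arcCache
	ds    datastore
}

func (c *cachedHas) Has(key string) (bool, error) {
	if !c.bloom.Has(key) {
		return false, nil // definite miss, no disk hit
	}
	if has, ok := c.arc.Get(key); ok {
		return has, nil // recently answered, no disk hit
	}
	has, err := c.ds.Has(key)
	if err == nil {
		c.arc.Add(key, has) // remember the answer for next time
	}
	return has, err
}
```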
Resolves #2850
License: MIT
Signed-off-by: Jakub Sztandera <kubuxu@protonmail.ch>