Skip to content
This repository has been archived by the owner on Feb 12, 2024. It is now read-only.

fix: Do not load all of a DAG into memory when pinning #2372

Merged
merged 9 commits into from
Aug 21, 2019

Conversation

achingbrain
Copy link
Member

@achingbrain achingbrain commented Aug 20, 2019

Given a CID, the dag. _getRecursive method returns a list of all descendents of the node with the passed CID. This can cause enormous memory usage when importing large datasets.

Where this method is invoked the results are either a) disgarded or b) used to calculate the CIDs of the nodes which is then bad for memory and CPU usage.

This PR removes the buffering and CID recalculating for a nice speedup when adding large datasets.

In my (non-representative, may need all the other unfinished async/iterator stuff) testing, importing folder of 4MB files totalling about 5GB files with content from /dev/urandom into a fresh repo with a daemon running in the background is now:

go-ipfs
real	3m43.741s
user	0m31.955s
sys	0m31.959s
js-ipfs
real	3m40.725s
user	0m7.352s
sys	0m4.489s

Which is nice.

fixes #2310

Given a `CID`, the `dag._recursiveGet` method returns a list of all
descendents of the node with the passed `CID`.  This can cause
enormous memory useage when importing large datasets.

Where this method is invoked the results are either a) disgarded or
b) used to calculate the `CID`s of the nodes which is then bad for memory
*and* CPU usage.

This PR removes the buffering and `CID` recalculating for a nice speedup
when adding large datasets.

fixes #2310
@achingbrain achingbrain requested a review from alanshaw August 20, 2019 17:04
@alanshaw
Copy link
Member

Holy wow 3s faster than go-ipfs!

.aegir.js Outdated
@@ -8,7 +8,7 @@ const ipfsdServer = IPFSFactory.createServer()
const preloadNode = MockPreloadNode.createNode()

module.exports = {
bundlesize: { maxSize: '689kB' },
bundlesize: { maxSize: '756KB' },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is change is needed, In this branch I'm getting the following:

⨎ npx aegir build --bundlesize
Child
       1466 modules
Child
       1466 modules
 PASS  ./dist/index.min.js: 682.3KB < maxSize 756KB (gzip) 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will revert since we're disabling it anyway

@@ -25,7 +25,7 @@ jobs:
include:
- stage: check
script:
- npx aegir build --bundlesize
# - npx aegir build --bundlesize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self we need a sister PR that re-enables this after it is merged.

q.push({ cid })
}

function getIndirectKeys (callback) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to pass the preload option from pin.ls here and then pass it to walkDag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@achingbrain
Copy link
Member Author

Holy wow 3s faster than go-ipfs!

For certain datasets under certain conditions. Not cracking out the 🍾 just yet...

Copy link
Member

@alanshaw alanshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@alanshaw alanshaw merged commit f357c28 into master Aug 21, 2019
@alanshaw alanshaw deleted the do-not-load-dags-into-memory-while-pinning branch August 21, 2019 15:25
achingbrain added a commit that referenced this pull request Aug 22, 2019
Port of #2372 into gc branch to ease merging
achingbrain added a commit that referenced this pull request Aug 22, 2019
Port of #2372 into gc branch to ease merging
achingbrain added a commit that referenced this pull request Aug 22, 2019
Port of #2372 into gc branch to ease merging
alanshaw pushed a commit that referenced this pull request Aug 23, 2019
alanshaw pushed a commit that referenced this pull request Aug 26, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

jsipfs add -Qr <directory-containing_1206 _files> doesn't work
2 participants