This repository has been archived by the owner on Dec 6, 2022. It is now read-only.

feat: create car file from a complete graph #3

Merged: 6 commits, Jun 30, 2020


mikeal
Contributor

@mikeal mikeal commented Mar 20, 2020

I tried to match the style but I’m sure there are some differences you may want me to clean up before merging.

I’m not sure if this is your preferred API. It matches what we have in a lot of the other JS libraries but isn’t exactly aligned with the APIs that are here currently. I figured I should wait to write up the docs until you had a chance to look at the PR and potentially change the API.

@mikeal mikeal requested a review from rvagg March 20, 2020 23:45
@rvagg
Member

rvagg commented Mar 23, 2020

Seems fine to me.

Just thinking through the ways in which this is likely to expand so we don't end up stepping on our own toes. There are two: multiple roots and selectors.

For comparison, go-car can now handle both situations with traversal. Its new SelectiveCar effectively does this:

car = selectiveCar(dataSource, [ { root: cid, selector: ... }, ... ])
car.write(outputStream)

Not necessarily a great API to follow but not too far away from what you have now.

If we put root last in the args, then we get to make it varargs in the future if we want and not require a strict array. This could be expanded in the future to take a selector too, but I suppose we could just do a CID.isCID(root) to see whether it's a root or a {root,selector} pair. Then it's just the name that would be awkward. Is it worth making the name more future-proof somehow now, or just breaking it in the future? selectiveCar() wouldn't be terrible, as long as it's documented that you're "selecting all".
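A minimal sketch of the "roots last" idea, for illustration only. The `normaliseRoots` helper and the local `isCID` stand-in are hypothetical and not part of this PR; real code would call `CID.isCID()` as mentioned above rather than the duck-typing check used here.

```javascript
// Stand-in for CID.isCID() so the sketch runs on its own.
// Assumption: a "CID" here is any object with `version` and `multihash`.
function isCID (value) {
  return value != null && typeof value === 'object' &&
    'version' in value && 'multihash' in value
}

// Hypothetical helper: roots come last as rest arguments, and each one may be
// a bare CID or a { root, selector } pair.
function normaliseRoots (...roots) {
  return roots.map((entry) => {
    if (isCID(entry)) {
      // A bare root means "select all", i.e. the complete graph.
      return { root: entry, selector: null }
    }
    if (entry != null && isCID(entry.root)) {
      return { root: entry.root, selector: entry.selector || null }
    }
    throw new TypeError('expected a CID or a { root, selector } pair')
  })
}
```

With this shape a caller can pass one root, several roots, or a mix of roots and pairs without wrapping them in an array.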

car.js Outdated
async function traverseBlock (block, get, car, seen = new Set()) {
  const cid = await block.cid()
  await car.put(cid, block.encodeUnsafe())
  seen.add(cid.toString('base64'))
Member

doesn't work with CIDv0 unfortunately, maybe just a toString() here and line 144?

Contributor Author

so, the way CID.toString() works is that it caches the string type it was instantiated with and uses that as the default. this was done for perf and for some backwards compatibility, but it makes it problematic when using toString() as a cache key because it isn’t guaranteed to be consistent 😕

Member

So what do? ipfs/js-ipfs#2953 doesn't actually work because of this. I'm using your branch but manually edited this file to remove 'base64' here and below, otherwise it blows up as soon as it encounters a CIDv0. Do we need to if/else on the version?

Contributor Author

We can just switch to explicit ‘base58’ instead. I can’t think of anything it would break.
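The cache-key problem discussed in this thread can be sketched roughly as follows. `MockCID` is a hypothetical stand-in written for this example: it is not the real CID class and does no real multibase encoding. It only mimics the behaviour described above, where `toString()` with no argument falls back to the base the instance was created with.

```javascript
// Mock for illustration only; the string format is made up.
class MockCID {
  constructor (digestHex, defaultBase) {
    this.digestHex = digestHex
    this.defaultBase = defaultBase
  }

  toString (base = this.defaultBase) {
    return `${base}:${this.digestHex}`
  }
}

// Always name the base, so the same content always yields the same key.
const keyFor = (cid) => cid.toString('base58btc')

const seen = new Set()
const fromBase32 = new MockCID('abcd', 'base32')
const fromBase58 = new MockCID('abcd', 'base58btc')

seen.add(keyFor(fromBase32))

// Same content, different default base: the implicit strings differ,
// but the explicit key matches.
const implicitMatch = fromBase32.toString() === fromBase58.toString()
const explicitMatch = seen.has(keyFor(fromBase58))
```

Here `implicitMatch` is false and `explicitMatch` is true, which is why an explicit base is the safer choice for a `seen` set.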

@mikeal
Contributor Author

mikeal commented Mar 26, 2020

I just pushed a fix. I had forgotten that raw blocks don’t have a reader (by design) so you need to filter on the codec before asking for a reader.
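A rough sketch of that filter, using hypothetical in-memory blocks rather than the real block API. The `makeBlock` helper, the string keys, and the shape of `reader().links()` are all assumptions made for this example. The only behaviour taken from the thread is that raw blocks have no reader, so the codec is checked first.

```javascript
// Hypothetical block factory: only non-raw blocks get a reader.
const makeBlock = (key, codec, links) => {
  const block = { key, codec }
  if (codec !== 'raw') block.reader = () => ({ links: () => links })
  return block
}

const blocks = new Map([
  ['root', makeBlock('root', 'dag-cbor', ['a', 'b'])],
  ['a', makeBlock('a', 'dag-cbor', ['b'])],
  ['b', makeBlock('b', 'raw', [])]
])

// Stand-in for the `get` function that loads a block by its key.
const get = async (key) => blocks.get(key)

async function traverse (block, get, written, seen = new Set()) {
  written.push(block.key)
  seen.add(block.key)
  // Raw blocks have no reader, so return before asking for one.
  if (block.codec === 'raw') return
  for (const link of block.reader().links()) {
    if (seen.has(link)) continue
    await traverse(await get(link), get, written, seen)
  }
}
```

Without the codec check, `block.reader()` would throw on block `b`, because the mock defines no `reader` for raw blocks.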

car.js Outdated
}
}

async function completeGraph (root, get, car) {
Member

thoughts on moving root to last so we can ...root at some point?

Contributor Author

i ended up having to add an optional parameter for configurable concurrency.
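For readers unfamiliar with the idea, a concurrency limit could look roughly like the sketch below. `mapWithLimit` and its `limit` parameter are hypothetical names chosen for this example. The thread does not show how the PR implements its concurrency option, so this is one common pattern and not a description of the merged code.

```javascript
// Run `fn` over `items` with at most `limit` calls in flight at once.
async function mapWithLimit (items, limit, fn) {
  const results = new Array(items.length)
  let next = 0

  // Each worker pulls the next unclaimed index until none are left.
  const worker = async () => {
    while (next < items.length) {
      const index = next++
      results[index] = await fn(items[index])
    }
  }

  const count = Math.min(limit, items.length)
  await Promise.all(Array.from({ length: count }, () => worker()))
  return results
}
```

Results keep the order of `items` even though calls may finish out of order, because each worker writes to the index it claimed.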

car.js Outdated
  seen.add(cid.toString('base58btc'))
  if (cid.codec === 'raw') return
  const reader = block.reader()
  const missing = link => !seen.has(link.toString('base58btc'))
Member

One minor style nit that grates on me is arrow functions without parens. I'd write (link) => ... to make it crystal clear what's going on here.

@rvagg rvagg force-pushed the complete-graph branch 2 times, most recently from 04ded7b to 6ce7912, June 30, 2020 02:26
@rvagg
Member

rvagg commented Jun 30, 2020

Added docs, updated deps, fixed up coverage, and made some minor style tweaks. This is ready to land as long as I can get Actions to work on it.

@rvagg rvagg merged commit 0a3654e into master Jun 30, 2020
@rvagg rvagg deleted the complete-graph branch June 30, 2020 03:14