lazy-adr: Add Data Availability library #170

liamsi · 2021-03-01T10:31:36Z

Description

This PR will contain the "DA lib", accompanied by an ADR that describes the design and provides some context.

Rendered ADR: https://github.com/lazyledger/lazyledger-core/blob/ismail/da_lib_adr/docs/lazy-adr/adr-002-ipds-da-sampling.md

related to: #85, #163

- provide context - notes / todos

liamsi · 2021-03-02T08:42:20Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+### A Note on IPFS/IPLD
+
+In IPFS all data is _content addressed_ which basically means the data is identified by its hash.
+Particularly, in the LazyLedger case, the root CID identifies the Namespaced Merkle tree including all its contents (inner and leaf nodes).
+This means that if a `GetLeafData` request succeeds, the retrieved leaf data is in fact the leaf data in the tree.
+We do not need to additionally verify Merkle proofs per leaf as this will essentially be done via IPFS on each layer while
+resolving and getting to the leaf data.
+
+> TODO: validate this assumption and link to code that shows how this is done internally


Or do we want to explicitly verify proofs either way, so that we do not rely on IPFS, in combination with our plugin, handling this correctly?

cc @musalbas @adlerjohn

I think it's safer and more idiot-proof if `GetLeafData` only succeeds if the proof is valid. Anyway, I thought it would only succeed with IPFS if the proof is valid, with the custom hasher?

Anyway, I thought it would only succeed with IPFS if the proof is valid, with the custom hasher?

Yes, that is my understanding as well. For every retrieved leaf, the proof nodes should also be resolved and validated on its path down.
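
The property discussed in this thread can be illustrated without IPFS. The sketch below is only a model under stated assumptions: `store`, `put`, `leafNode`, `innerNode` and `getLeaf` are hypothetical names, and plain SHA-256 is used instead of the namespaced hasher. It is not how IPFS or the plugin is implemented. It shows why a lookup in a content-addressed store that re-hashes every node on the way down can only succeed for data that is committed to by the root.

```go
package main

import (
	"crypto/sha256"
	"errors"
)

// store maps a node's hash to its raw bytes, mimicking a content-addressed
// block store. It is a hypothetical stand-in, not an IPFS API.
type store map[[32]byte][]byte

// put hashes data, stores it under that hash, and returns the hash.
func (s store) put(data []byte) [32]byte {
	h := sha256.Sum256(data)
	s[h] = data
	return h
}

// leafNode encodes a leaf as the prefix 0x00 followed by the data.
func leafNode(data []byte) []byte { return append([]byte{0x00}, data...) }

// innerNode encodes an inner node as the prefix 0x01 followed by the
// 32-byte hashes of the left and right child.
func innerNode(l, r [32]byte) []byte {
	out := []byte{0x01}
	out = append(out, l[:]...)
	return append(out, r[:]...)
}

// getLeaf walks from root to a leaf along path (false = left, true = right).
// Every fetched node is re-hashed and compared with the hash its parent
// committed to, so a successful return implies the leaf is part of the tree.
func getLeaf(s store, root [32]byte, path []bool) ([]byte, error) {
	want := root
	for _, right := range path {
		node, ok := s[want]
		if !ok {
			return nil, errors.New("node not found")
		}
		if sha256.Sum256(node) != want {
			return nil, errors.New("hash mismatch on inner node")
		}
		if len(node) != 65 || node[0] != 0x01 {
			return nil, errors.New("expected inner node")
		}
		if right {
			copy(want[:], node[33:65])
		} else {
			copy(want[:], node[1:33])
		}
	}
	leaf, ok := s[want]
	if !ok {
		return nil, errors.New("leaf not found")
	}
	if sha256.Sum256(leaf) != want {
		return nil, errors.New("hash mismatch on leaf")
	}
	if len(leaf) == 0 || leaf[0] != 0x00 {
		return nil, errors.New("expected leaf node")
	}
	return leaf[1:], nil
}
```

In this model, replacing the bytes stored under a leaf's hash makes `getLeaf` fail instead of returning the wrong data, which is the behaviour the thread expects from `GetLeafData`.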


liamsi · 2021-03-02T09:39:12Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+// The context can be used to provide a timeout.
+// TODO: Should there be a constant = lower bound for #samples
+func ValidateAvailability(
+    ctx context.Context,


We should consider moving these Context objects lower down the stack too.

musalbas

Nice work!


musalbas · 2021-03-02T14:57:59Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+````go
+// This constructs an IPFS node instance
+node, _ := core.NewNode(ctx, nodeOptions)
+// This attaches the Core API to the constructed node


What happens if you don't attach a core API?

You could also pass around the node object directly, or simply the DAG field's ipld.DAGService. In the former case it would just be less pluggable (as we are passing around a concrete object instead of an interface).


musalbas · 2021-03-02T15:06:31Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+// to process the leaf data the moment it was validated.
+// The context can be used to provide a timeout.
+// TODO: Should there be a constant = lower bound for #samples
+func ValidateAvailability(


This could block for a few minutes.

Yeah, this is definitely something that should be done asynchronously.


adlerjohn · 2021-03-02T16:57:01Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+// Specifically all steps of the protocol described in section
+// _5.2 Random Sampling and Network Block Recovery_ are carried out.
+//
+// In more detail it will first create numSamples random unique coordinates.


Note: add that the domain for coordinates can exclude parts of the original data square (and extended rows!) based on the number of "real" shares in a block, i.e. the availableDataOriginalSharesUsed field in the header: https://github.com/lazyledger/lazyledger-specs/blob/10732d7a258a0b64dfccf96fd863830faca73ce3/specs/data_structures.md#header

adlerjohn · 2021-03-02T16:58:07Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+// to process the leaf data the moment it was validated.
+// The context can be used to provide a timeout.
+// TODO: Should there be a constant = lower bound for #samples
+func ValidateAvailability(


Yeah, this is definitely something that should be done asynchronously.


evan-forbes · 2021-03-02T21:07:59Z

docs/lazy-adr/adr-002-ipds-da-sampling.md

+// Note, that this method could also return the row and column roots.
+// The caller is responsible for making sure that the leaves are sorted by namespace ID.
+// The data will be pinned by default.
+func PutLeaves(ctx context.Context, namespacedLeaves [][]byte) error


If we're going to be passing the IPFS node object, then I think PutLeaves will need a format.NodeAdder argument, as it will not have access to the IPFS node object.

evan-forbes · 2021-03-02T21:23:32Z

I'm not sure if defining the API perfectly is in the scope of this ADR, but I think the API will have to change to accommodate some form of access to the IPFS node object if we're passing it around. If we use one of the alternatives mentioned, then the API specified still might have to be modified slightly, but that depends on which alternative we end up going with.


evan-forbes · 2021-03-04T05:16:02Z

I posted a draft for the writing portion of this ADR at #178. The PR incorporates PutLeaves into PutBlock, as that allows for more optimization in batch adding the ipld nodes to the dag. It also adds an IPFS core object as an argument to PutBlock. Notably, the PR requires the erasure data be computed twice, as it is not currently cached, but that can change.

liamsi · 2021-03-04T09:46:54Z

I'm not sure if defining the API perfectly is in the scope of this ADR

I mostly created this ADR so that we have a basis on which to discuss API alternatives. If there are no major shortcomings in the ADR, we can use it as a blueprint to start the implementation. It is definitely not complete, and it won't be until we have wrapped up the implementation. I do not think it is realistic to define the APIs perfectly without drafting at least "spikes" / "tracer bullet" implementations.

Actually, my suggestion is to merge this if it is sound and sane from a high-level point of view. Then you and I can modify portions of the ADR while we work on the implementation in separate branches. The moment we want to merge a PR that implements a part of this, we need to make sure the ADR matches the implemented API (and that we have changed the status from Proposed to Implemented). At least that was the idea. I'm open to other suggestions.

evan-forbes

Sounds good, any changes needed during implementation can be made then. Two 👍 from me. 🚀

liamsi · 2021-03-05T10:14:50Z

OK, let's merge this then! I've captured a bunch of smaller todos here: #179

liamsi added 2 commits March 1, 2021 11:23

Commit work in progress:

b767127

- provide context - notes / todos

Add high-level design considerations

1e95a71

liamsi force-pushed the ismail/da_lib_adr branch from f661921 to 1e95a71 Compare March 1, 2021 11:26

liamsi requested a review from evan-forbes March 1, 2021 14:07

liamsi added 4 commits March 1, 2021 15:09

Add ipfs-agnostic part of the library

85c35f9

improve readability in rendered version

d988214

slightly restructure and add a PutLeaves and GetLeafData method (wip)

de21e31

Add comment about ipfs and some minor udpates

3c280cf

liamsi commented Mar 2, 2021


liamsi requested review from adlerjohn and musalbas March 2, 2021 08:44

liamsi added 2 commits March 2, 2021 10:31

Refactored the main API and moved putting the leaves down the stack

697f186

minor language glichtes and a clarification

bef6ca4

liamsi commented Mar 2, 2021


liamsi commented Mar 2, 2021


liamsi changed the title ~~Add DA library~~ Add DA library ADR Mar 2, 2021

liamsi marked this pull request as ready for review March 2, 2021 09:54

liamsi requested a review from tac0turtle as a code owner March 2, 2021 09:54

liamsi added 4 commits March 2, 2021 13:00

Add some pros/cons

23d0ae4

break up changes into smaller packages for the sake of reviewability

b7f31e2

mention batch adding in context, too

b40785b

minor clarification

d34fa08

musalbas reviewed Mar 2, 2021


adlerjohn reviewed Mar 2, 2021


evan-forbes reviewed Mar 2, 2021


liamsi and others added 4 commits March 3, 2021 02:50

Apply a batch of suggestions

4cf8b75

Co-authored-by: John Adler <adlerjohn@users.noreply.github.com>

Apply suggestions from code review

69ea2ba

Co-authored-by: John Adler <adlerjohn@users.noreply.github.com>

Merge branch 'ismail/da_lib_adr' of https://github.com/LazyLedger/laz…

8236308

…yledger-core into ismail/da_lib_adr

Address some more review feedback

795056e

evan-forbes mentioned this pull request Mar 4, 2021

Use the IPFS core object to post block data during proposal #178

Merged

liamsi added 2 commits March 4, 2021 09:45

add links

bd45d7e

add more links

dc84db8

liamsi added 3 commits March 4, 2021 12:18

permalinks as requested

5deebfc

link Header

099f18c

less abstract description of "store the block in the network"

2ff8fc9

liamsi changed the title ~~Add DA library ADR~~ lazy-adr: Add Data Availability library Mar 4, 2021

evan-forbes approved these changes Mar 4, 2021


liamsi mentioned this pull request Mar 5, 2021

Minor followups on adr 002 #179

Closed

5 tasks

liamsi merged commit ad4ee87 into master Mar 5, 2021

liamsi deleted the ismail/da_lib_adr branch March 5, 2021 10:15

liamsi mentioned this pull request Mar 9, 2021

Implement reading from IPLD merkle dag #194

Closed

4 tasks

evan-forbes mentioned this pull request Mar 11, 2021

Move and refactor the PutBlock method #196

Closed

This was referenced Mar 16, 2021

Sampling and data extraction #35

Closed

Switch to LL-specific plugin or run LL node celestiaorg/ipld-plugin-experiments#10

Closed

evan-forbes mentioned this pull request Apr 6, 2021

Implement ValidateAvailability per ADR 002 #269

Closed
