Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include namespaces in erasured data #230

Closed
evan-forbes opened this issue Mar 17, 2021 · 4 comments · Fixed by #235
Closed

Include namespaces in erasured data #230

evan-forbes opened this issue Mar 17, 2021 · 4 comments · Fixed by #235
Assignees
Labels
C:block-recovery Component: Block recovery P:High Priority High

Comments

@evan-forbes
Copy link
Member

evan-forbes commented Mar 17, 2021

Namespaces are currently left out when computing the extended datasquare. This becomes problematic when retrieving block data and we need to repair the underlying data square, as the namespaces are not preserved. To fix this, we should include the namespaces in the erasure data, along with adding some logic to use the appropriate namespace depending on which quadrant is being used.

spec ref: #145

@evan-forbes evan-forbes added C:block-recovery Component: Block recovery P:High Priority High labels Mar 17, 2021
@evan-forbes evan-forbes self-assigned this Mar 17, 2021
@evan-forbes
Copy link
Member Author

the plan was for lazyledger-core to use the nmt wrapper (#238) to generate roots for lazyledger-core, but now it's clear that the wrapper won't actually work for an extended data square. This is due to the fact that the wrapper has to know which row or column is being pushed in order to perform logic as to which namespace should be used. semi related comment Without this functionality, we cannot repair the data square, and therefore #235 and #232 are blocked. 👀

Extra details as to why the wrapper won't work

Using the wrapper's current construction, it only works for rows and cols that start in Q0 of the extended data square. As soon as a row or col that starts in one of the other quadrants is pushed, the wrapper has no way of knowing that it should use the parity namespace for all leaves pushed. Without this information, the wrapper ends up using incorrect namespaces, and the underlying nmt tree panics.

Options

  1. modify the rsmt2d.Tree interface to indicate to the underlying tree which row or col is being pushed
  2. Use a different interface, possibly something similar to this and pass that to rsmt2d.RepairExtendedDataSquare
  3. Create an external version of rsmt2d.RepairExtendedDataSquare that doesn't require rsmt2d.Tree

(1) will likely require the smallest number of changes
(2) more changes, but we already have code written for it
(3) modest amount of changes, but all modifications will stay in lazyleder-core

ref #10
this decision will also effect #172

@liamsi
Copy link
Member

liamsi commented Mar 26, 2021

Just noting that we kinda already went down the route of (3) when computing the row and column roots.

I'm inclined to favor (3) simply because it feels like we are tried to work around the shortcomings of rsmt2d for quite a while already. Rsmt2d is simply not written in a way that has lazyledger as the main use-case in mind. My gut feeling is that we might run into yet another trap with (1) and (2):
(1) might require additional changes and (2) is a beautiful but very abstract and general approach while we should aim for the simplest possible and dedicated to the use-case in ll-core right now.
(3) can be seen as the simplest way to test out what concrete interfaces will be needed for rsmt2d (celestiaorg/rsmt2d#12) and to fully understand what route to go down later (e.g. either (1) or (2) or something else even) without any risks. Also, we can clean up the code at the same time as we fo about (3). Another benefit is that with (3) we can optimize for this particular use-case.

Either way, I'll try to gather all pros and cons and edit this comment here to list all those later.

@liamsi
Copy link
Member

liamsi commented Mar 26, 2021

modify the rsmt2d.Tree interface to indicate to the underlying tree which row or col is being pushed (1)

Pros

  • sounds like a simple and straight-forward change

Cons

  • (higher) risk that we are overlooking something (compared to other options)

Neutral

Use a different interface, possibly something similar to this, and pass that to rsmt2d.RepairExtendedDataSquare (2)

Pros

  • API change that enables using the API in other ways
  • code is already written to a large extent

Cons

Neutral

Create an external version of rsmt2d.RepairExtendedDataSquare that doesn't require rsmt2d.Tree (3)

Pros

  • full control on API -> no need to think about everything in a broader context, e.g. how else rsmt2d might be used outside of lazyledger-core
  • could optimize it for this particular use-case here
  • less of a need to coordinate dependencies across repositories
  • a chance to clean up the code while at it (implement lazy root computation rsmt2d#18 (comment))
  • can very clearly inform how the API of rsmt2d should look like (at least from the perspective of lazyledger (Clarify public API rsmt2d#12)

Cons

  • more work / more time consuming potentially

Neutral

  • we partly went down that route already; that is certainly not a good reason to favour this but you could argue in favor of consistency one way or the other (consistently use rsmt2d or consistently rip out the parts that we need here)

There might be other options that we are missing here.

@liamsi
Copy link
Member

liamsi commented Mar 26, 2021

We've decided to go with option (1) as it hopefully is a very simple change that only requires a few lines of code. If that still poses new issues (e.g. when trying to repair the eds), we will reconsider (2) and (3).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C:block-recovery Component: Block recovery P:High Priority High
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants