Use IPLD Dag instead of CoreAPI #352

Merged
merged 5 commits into from
May 26, 2021

Conversation

Member

@Wondertan Wondertan commented May 23, 2021

The PR's main goal is to simplify everything related to IPFS usage by:

  1. Using IPLD Merkle DAG interfaces instead of CoreAPI.
  2. Reworking the data retrieval logic to be based solely on the IPLD DAG (optimization).
  3. Using DAG sessions (optimization).
  4. Reworking and cleaning up some tests.

DAS timings before the two optimizations above:

I[2021-05-23|23:12:11.531] Successfully finished DAS sampling           height=629 numSamples=15 elapsedtime=1.015415079s
I[2021-05-23|23:12:11.768] Starting Data Availability sampling          height=630 numSamples=15 squareWidth=16
I[2021-05-23|23:12:12.690] Successfully finished DAS sampling           height=630 numSamples=15 elapsedtime=921.977083ms
I[2021-05-23|23:12:13.060] Starting Data Availability sampling          height=631 numSamples=15 squareWidth=16
I[2021-05-23|23:12:14.801] Successfully finished DAS sampling           height=631 numSamples=15 elapsedtime=1.740647283s
I[2021-05-23|23:12:15.003] Starting Data Availability sampling          height=632 numSamples=15 squareWidth=16

DAS timings after optimizations:

I[2021-05-23|23:13:44.353] Successfully finished DAS sampling           height=648 numSamples=15 elapsedtime=239.112542ms
I[2021-05-23|23:13:44.485] Starting Data Availability sampling          height=649 numSamples=15 squareWidth=16
I[2021-05-23|23:13:44.719] Successfully finished DAS sampling           height=649 numSamples=15 elapsedtime=234.11647ms
I[2021-05-23|23:13:44.853] Starting Data Availability sampling          height=650 numSamples=15 squareWidth=16
I[2021-05-23|23:13:45.095] Successfully finished DAS sampling           height=650 numSamples=15 elapsedtime=242.104799ms
I[2021-05-23|23:13:45.227] Starting Data Availability sampling          height=651 numSamples=15 squareWidth=16

NOTE: The heights in the optimized run had not been DASed before, so the light client did not already have any of their data.
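To put the two log excerpts side by side, here is a minimal sketch that averages the three `elapsedtime` values from each run and computes their ratio. The helper names `mean` and `mustParse` are made up for this illustration, and three samples per run is too few to treat the result as a benchmark.

```go
package main

import (
	"fmt"
	"time"
)

// mean returns the arithmetic mean of the given durations.
func mean(ds []time.Duration) time.Duration {
	var sum time.Duration
	for _, d := range ds {
		sum += d
	}
	return sum / time.Duration(len(ds))
}

// mustParse converts elapsedtime strings, as printed in the logs, into
// durations. It panics on malformed input, which is acceptable for a sketch.
func mustParse(vals ...string) []time.Duration {
	out := make([]time.Duration, 0, len(vals))
	for _, v := range vals {
		d, err := time.ParseDuration(v)
		if err != nil {
			panic(err)
		}
		out = append(out, d)
	}
	return out
}

func main() {
	// Values copied from the log lines for heights 629-631 and 648-650.
	before := mustParse("1.015415079s", "921.977083ms", "1.740647283s")
	after := mustParse("239.112542ms", "234.11647ms", "242.104799ms")

	speedup := float64(mean(before)) / float64(mean(after))
	fmt.Printf("before=%v after=%v speedup=%.1fx\n", mean(before), mean(after), speedup)
}
```

On these samples the optimized run comes out roughly five times faster on average.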

@codecov-commenter

Codecov Report

❗ No coverage uploaded for pull request base (ismail/light-mvp@dbea20c).
The diff coverage is n/a.

@@                 Coverage Diff                 @@
##             ismail/light-mvp     #352   +/-   ##
===================================================
  Coverage                    ?   61.85%           
===================================================
  Files                       ?      262           
  Lines                       ?    22930           
  Branches                    ?        0           
===================================================
  Hits                        ?    14184           
  Misses                      ?     7251           
  Partials                    ?     1495           

p2p/ipld/read.go Outdated
Comment on lines 248 to 257
	total /= 2
	if leaf < total {
		root = lnks[0].Cid
	} else {
		root, leaf = lnks[1].Cid, leaf-total
	}
Member

mind adding some documentation somewhere? doesn't have to be anything too elaborate, but I think some simple guidance would make a big difference. Particularly for readers who are not familiar with our plugin.
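As a rough illustration of the index arithmetic in the snippet under review, the sketch below computes which link is followed at each level when looking for a given leaf. It assumes a perfect binary tree whose leaf count `total` is a power of two. The function name `leafPath` is hypothetical and is not part of the PR.

```go
package main

import "fmt"

// leafPath returns, for each level of a perfect binary tree with `total`
// leaves, which link (0 = left, 1 = right) leads towards leaf number `leaf`.
// It assumes total is a power of two and leaf < total.
func leafPath(leaf, total uint32) []int {
	var path []int
	for total > 1 {
		total /= 2 // number of leaves under each child of the current node
		if leaf < total {
			path = append(path, 0) // target lies in the left subtree
		} else {
			path = append(path, 1) // target lies in the right subtree
			leaf -= total          // re-index relative to the right subtree
		}
	}
	return path
}

func main() {
	fmt.Println(leafPath(5, 8)) // prints [1 0 1]
}
```

The path is simply the binary representation of the leaf index, most significant bit first, which is why halving `total` and subtracting it on the right branch is enough to locate the leaf.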

Member Author

@Wondertan Wondertan May 24, 2021

Sure. Also, that's not the final form of the PR. I think I will close it and rewrite it afresh based on master, but that requires #323 to be merged first.

Member

merged :-)

Member Author

I added some documentation now, hope it will help understand the logic

Member

Cool! Just saw that after reviewing.

> Use of Dag Sessions(optimization)

I think I understand this now. Although it's not clear to me how it works under the hood. But it makes sense intuitively.

Base automatically changed from ismail/light-mvp to master May 24, 2021 09:55

@Wondertan Wondertan changed the title from "Rework and optimize GetLeafData for Light Client" to "Use IPLD Dag instead of CoreAPI" May 26, 2021
@Wondertan
Member Author

@liamsi, @evan-forbes, I decided not to close this PR and instead extend it with the requested changes for CoreAPI removal

@Wondertan
Member Author

In a follow-up PR I will go even further and propose our own custom interface for network DataAvailability that works over the DAG.

@Wondertan Wondertan force-pushed the hlib/mvp-experiments branch 2 times, most recently from 08e51ea to 631ab19 Compare May 26, 2021 19:58
Member

@liamsi liamsi left a comment

This looks great! Left some suggestions and questions.

@@ -9,6 +9,7 @@ import (
"testing"
"time"

mdutils "github.com/ipfs/go-merkledag/test"
Member

@liamsi liamsi May 26, 2021

Can we use another alias? mdutils sounds like markdown utils. Suggestion: dagutils? dagtest?

Member Author

That's a default package name.

Member Author

I won't change that, as it's actually a temporary thing.

Member Author

Later, in the block propagation PR, I will use Mock with Mock networking instead of just Mock.

Member Author

So I won't spend time on this renaming.

Member

@liamsi liamsi May 26, 2021

> That's a default package name.

But what does that even mean? Is the package aliased like this in other contexts?

p2p/ipld/read.go Outdated
Comment on lines 235 to 239
// GetLeaf fetches and returns the raw leaf.
// It walks down the IPLD NMT tree until it finds the requested one.
func GetLeaf(ctx context.Context, dag format.NodeGetter, root cid.Cid, leaf, total uint32) (format.Node, error) {
	// request the node
	nd, err := dag.Get(ctx, root)
Member

the new GetLeaf is much simpler! I like it a lot 👍

Member

@liamsi do we need to update ADR002 with this slight api change?

Member

Yeah, it would be good if the ADR would match the implementation's APIs.

Member Author

I think so, but I am also going to propose an elegant PR soon. If it gets accepted, updating the ADR will make more sense then.

@liamsi
Member

liamsi commented May 26, 2021

(from CI)

race: limit on 8128 simultaneously alive goroutines is exceeded, dying
FAIL github.com/lazyledger/lazyledger-core/p2p/ipld 16.744s

This is why the race detector hack was there btw.

@Wondertan
Member Author

Wondertan commented May 26, 2021

@liamsi, I had guessed that removing Mock IPFS would decrease the number of goroutines enough to remove the hack. Will try something else to fix that.

t.Run(fmt.Sprintf("%s size %d", tc.name, tc.squareSize), func(t *testing.T) {
	// if we're using the race detector, skip some large tests due to time and
	// concurrency constraints
	if racedetector.IsActive() && tc.squareSize > 8 {
Member

To save you some time and to unblock this PR: we can either revert the changes regarding the racedetector, or, we simply skip all tests with a squareSize > 8.

Member

@evan-forbes evan-forbes left a comment

two 👍 from me! all of these refactors and optimizations are really adding up! ✨

@Wondertan
Member Author

Wondertan commented May 26, 2021

@liamsi, @evan-forbes, after looking deeply into why the issue above happens, I conclude that we have a goroutine leak that we should fix. With that hack, we simply ignored the issue, so I don't think bringing the hack back makes sense.

We must limit the number of goroutines spawned by:

go sc.retrieveShare(rootCid, true, row, col, dag)

@Wondertan
Member Author

I would like to handle this myself as part of this PR, but I am too tired to do that today and it is midnight my time.

@evan-forbes
Member

evan-forbes commented May 26, 2021

we don't have to merge this tonight. I'll try limiting the goroutines on a separate branch and see what happens

@Wondertan
Member Author

Ok, but I already pushed a temporary hack to unblock this.

@liamsi
Member

liamsi commented May 26, 2021

While it makes perfect sense to limit the number of go-routines spawned in that method, it's not a go-routine leak. Also note that the error only kicks in when the race detector is active; without the race detector, the code works fine even with larger blocks.

I think we can skip that test for larger blocks for now, with a note that we should limit the number of go-routines spawned (I'll open an issue). If possible, let's merge this and handle the issue in a separate PR.

@Wondertan
Member Author

Wondertan commented May 26, 2021

> it's not a go-routine leak

That's true! A goroutine leak is when a goroutine hangs forever without returning, e.g. not calling cancel() for a context will introduce one.

My original intuition was that spawning tons of goroutines is like leaking something. Obviously, doing that is wrong: it affects performance and should be avoided. Idk what the proper name for that is, btw.

Member

@liamsi liamsi left a comment

Awesome work! 👍🏼 🚀

@Wondertan Wondertan merged commit f46cbc6 into master May 26, 2021
@Wondertan Wondertan deleted the hlib/mvp-experiments branch May 26, 2021 21:13
@liamsi liamsi mentioned this pull request May 26, 2021
cmwaters pushed a commit that referenced this pull request Mar 13, 2023
by moving the `codecov.yml` file from .github to the root folder.

---

#### PR checklist

- [x] Tests written/updated
- [x] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog)
- [x] Updated relevant documentation (`docs/` or `spec/`) and code comments

(cherry picked from commit c906604)

Co-authored-by: Lasaro <lasaro@informal.systems>
4 participants