Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: compute leaf hashes in the Push method #172

Merged
merged 21 commits into from
Apr 6, 2023
Merged

Conversation

staheri14
Copy link
Contributor

@staheri14 staheri14 commented Mar 31, 2023

Overview

Closes #108 and #143

Checklist

  • New and updated code has appropriate documentation
  • New and updated code has new and/or updated testing
  • Required CI checks are passing
  • Visual proof for any user facing features like CLI or documentation updates
  • Linked issues closed with keywords

@staheri14 staheri14 changed the title feat!: compute leaf hashes when they are pupshed feat!: compute leaf hashes in the Push method Mar 31, 2023
@staheri14 staheri14 self-assigned this Mar 31, 2023
@codecov
Copy link

codecov bot commented Mar 31, 2023

Codecov Report

Merging #172 (5b53987) into master (1884bfb) will decrease coverage by 0.43%.
The diff coverage is 80.00%.

@@            Coverage Diff             @@
##           master     #172      +/-   ##
==========================================
- Coverage   96.19%   95.76%   -0.43%     
==========================================
  Files           5        5              
  Lines         525      519       -6     
==========================================
- Hits          505      497       -8     
- Misses         12       13       +1     
- Partials        8        9       +1     
Impacted Files Coverage Δ
nmt.go 97.94% <66.66%> (-1.08%) ⬇️
hasher.go 97.54% <100.00%> (+0.08%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@staheri14 staheri14 changed the title feat!: compute leaf hashes in the Push method feat: compute leaf hashes in the Push method Mar 31, 2023
@staheri14 staheri14 marked this pull request as ready for review March 31, 2023 21:34
@staheri14 staheri14 added the enhancement New feature or request label Mar 31, 2023
Copy link
Member

@evan-forbes evan-forbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only two questions, otherwise LGTM this makes sense.

iirc, we have some benchmarks in this repo, have we compared this change against the old yet?

nmt.go Outdated
Comment on lines 415 to 416
// compute the leaf hash
res := n.treeHasher.MustHashLeaf(namespacedData)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not bubble the error here if there ever is one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two reasons why the error is not being bubbled up:

  • In line 410, the data validity is checked through the validateAndExtractNamespace function call. As a result, the HashLeaf function call cannot produce an error. To handle this scenario, the MustHashLeaf function is introduced, which assumes that the data passed to it has already been validated. This approach is similar to the current situation and avoids any unnecessary error handling.

  • If we were to use HashLeaf and bubble up the error, we would need to introduce an if block in our code i.e., the Push method. However, this if block would not be covered in our tests, leading to a reduction in test coverage percentage. Therefore, I have opted not to bubble up the error in this situation.

Please let me know if this addresses your question.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, but documenting it in the code for posterity would be a good addition imo. we could also just ignore the error if its impossible (and we have comments), but in general I think its always best to bubble even if we don't strictly need to. Its one less thing the reader or future contributors have to think about.

I'm not blocking on it though! will defer to the author 🙂

Copy link
Contributor Author

@staheri14 staheri14 Apr 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, replaced MustHashLeaf with HashLeaf and bubbled up the error, please see d60367b
With this change, the code coverage percentage decreases below the set threshold, and there is no way to develop a test to cover the line that captures the returned error. In such situations, what should we do? Should we still merge the PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, 100% merge imo

rootulp
rootulp previously approved these changes Apr 3, 2023
Copy link
Collaborator

@rootulp rootulp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@staheri14
Copy link
Contributor Author

staheri14 commented Apr 3, 2023

cc: @evan-forbes
Please find the benchmark comparison below. It's worth noting that the change introduced by this PR doesn't affect the ComputeRoot() benchmark test result. This is because the time measurement captures the total time taken to push N items and then compute the root. Before this change, the items were hashed inside the Root() method, whereas now they are hashed within the Push() method. As a result, the total number of hashes remains unchanged, and therefore the performance for this particular benchmark test does not improve.

goos: darwin
goarch: arm64
pkg: github.com/celestiaorg/nmt
                          │   master.txt   │              hash-and-push.txt               │
                          │   sec/op    │   sec/op     vs base               │
ComputeRoot/64-leaves-10    36.28µ ± 1%   38.10µ ± 0%  +5.00% (p=0.000 n=10)
ComputeRoot/128-leaves-10   72.23µ ± 0%   75.29µ ± 0%  +4.24% (p=0.000 n=10)
ComputeRoot/256-leaves-10   146.4µ ± 1%   152.7µ ± 1%  +4.30% (p=0.000 n=10)
geomean                     72.66µ        75.94µ       +4.51%

evan-forbes
evan-forbes previously approved these changes Apr 4, 2023
@staheri14 staheri14 dismissed stale reviews from evan-forbes and rootulp via d60367b April 4, 2023 20:05
@staheri14
Copy link
Contributor Author

@rootulp Can you please review the recent changes 🙏🏼

@staheri14 staheri14 merged commit f2e4038 into master Apr 6, 2023
@staheri14 staheri14 deleted the hash-and-push branch April 6, 2023 16:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat: compute leaf hashes immediately after data items are pushed
3 participants