Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BCI-3668 Optimise HeadTracker's memory usage #14130

Merged
merged 20 commits into from
Sep 23, 2024

Conversation

dhaidashenko
Copy link
Collaborator

@dhaidashenko dhaidashenko commented Aug 15, 2024

Ticket.

Previously HeadTracker performed deep copy of the canonical chain every time new block was added to avoid race condition on .Parent field, when old blocks were removed. This causes large number of allocations and as result increases CPU usage due to GC.

PR replaces deep copy with atomic pointer, which results in slower iteration through the chain, but significantly reduces number of allocations.
Benchmark results:
Iterate through whole chain:

benchstat lock_free_earliest_head_in_chain atomic_earliest_head_in_chain
goos: darwin
goarch: arm64
pkg: github.com/smartcontractkit/chainlink/v2/core/chains/evm/headtracker
                       │ lock_free_earliest_head_in_chain │   atomic_earliest_head_in_chain    │
                       │              sec/op              │   sec/op     vs base               │
EarliestHeadInChain-10                        15.21m ± 6%   15.99m ± 4%  +5.12% (p=0.005 n=10)

                       │ lock_free_earliest_head_in_chain │ atomic_earliest_head_in_chain  │
                       │               B/op               │    B/op     vs base            │
EarliestHeadInChain-10                         72.00 ± 0%   72.00 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

                       │ lock_free_earliest_head_in_chain │ atomic_earliest_head_in_chain  │
                       │            allocs/op             │ allocs/op   vs base            │
EarliestHeadInChain-10                         3.000 ± 0%   3.000 ± 0%  ~ (p=1.000 n=10) ¹
¹ all samples are equal

Add new head to the chain:

 benchstat lockfee_backfill atomic_backfill
goos: darwin
goarch: arm64
pkg: github.com/smartcontractkit/chainlink/v2/core/chains/evm/headtracker
                           │ lockfee_backfill │           atomic_backfill           │
                           │      sec/op      │   sec/op     vs base                │
Heads_SimulatedBackfill-10    4694883.0n ± 1%   801.4n ± 1%  -99.98% (p=0.000 n=10)

                           │ lockfee_backfill │          atomic_backfill           │
                           │       B/op       │    B/op     vs base                │
Heads_SimulatedBackfill-10    13295246.5 ± 0%   713.0 ± 0%  -99.99% (p=0.000 n=10)

                           │ lockfee_backfill │          atomic_backfill           │
                           │    allocs/op     │ allocs/op   vs base                │
Heads_SimulatedBackfill-10     32057.000 ± 0%   6.000 ± 0%  -99.98% (p=0.000 n=10)

@dhaidashenko dhaidashenko marked this pull request as ready for review September 5, 2024 12:27
@dhaidashenko dhaidashenko requested review from a team as code owners September 5, 2024 12:27
@dhaidashenko dhaidashenko requested review from cedric-cordenier and removed request for a team September 5, 2024 12:27
jmank88
jmank88 previously approved these changes Sep 6, 2024
}
if parent, ok := h.headsByHash[newHead.ParentHash]; ok {
if parent.Number >= newHead.Number {
return fmt.Errorf("potential cycle detected while adding newHead as child: %w", newPotentialCycleError(parent, newHead))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is a cycle problem in the context of Head Tracker, but this is more of a general problem as the parent's block number is higher than the child that references it. I would consider this as invalid head/data from the chain.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newPotentialCycleError already signals that we expect block number to strictly decrease in 'child -> parent' relation. Do you suggest changing this msg?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newPotentialCycleError error message is accurate. Perhaps I would rephrase the potential cycle... part because if the hashes are right, from a user perspective the issue would be better categorized as bad data on the block number rather than a cycle problem, but it's just me nitpicking. You can ignore this.

Comment on lines 21 to 23
func (h *headsHeap) Pop() any {
old := h.values[len(h.values)-1]
h.values = h.values[:len(h.values)-1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the heap is structured in ascending order, shouldn't old pop the smallest/first item of the array?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

headsHeap implements heap.Interface, which requires Pop to remove the last element from the slice.

Comment on lines 139 to 141
if h == nil {
return 0
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this since you already check for nil pointer below?

dimriou
dimriou previously approved these changes Sep 11, 2024
DylanTinianov
DylanTinianov previously approved these changes Sep 11, 2024
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but I wonder if we have a generic heap type anywhere already.

jmank88
jmank88 previously approved these changes Sep 12, 2024
# Conflicts:
#	core/chains/evm/logpoller/log_poller_internal_test.go
@dhaidashenko dhaidashenko requested a review from a team as a code owner September 23, 2024 10:33
@dimriou dimriou added this pull request to the merge queue Sep 23, 2024
Merged via the queue into develop with commit 31874ba Sep 23, 2024
127 of 128 checks passed
@dimriou dimriou deleted the feature/BCI-3668-optimise-ht-memory branch September 23, 2024 11:29
momentmaker added a commit that referenced this pull request Sep 24, 2024
…develop

* origin/develop: (233 commits)
  update Smoke Test TestAutomationBasic to load pre-deployed contracts if given (#14537)
  CCIP-2881  USDC Reader integration tests  (#14516)
  [TT-1624] link PR to solidity review issue (#14521)
  Fix skipped notification in E2E test workflow (#14538)
  Add regression testing for pruning bug (#14454)
  Bump owner dep (#14531)
  Fix state view (#14532)
  Deployment tooling (#14533)
  CCIP 3388 - add offramp generation (#14526)
  CCIP-3416 paginate token admin registry get all tokens call (#14514)
  Change Polygon zkEVM to use FeeHistory estimator (#14161)
  [TT-1747] fix core changeset (#14524)
  TT-1459 Use CTF actions from .github repo (#14522)
  [TT-1693] try more universal Solidity scripts (#14436)
  Remove unused workflow for e2e tests (#14520)
  BCI-3668 Optimise HeadTracker's memory usage (#14130)
  BCFR-888 LP support chains that have not reached finality yet (#14366)
  support new heads polling over http rpc client (#14373)
  Bump CTF (#14518)
  TT-1550 Migrate remaining E2E workflows to the reusable workflow (#14403)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants