
v5.2.0 built from source runs out of memory #5970

Closed
alecalve opened this issue Jun 20, 2024 · 22 comments

@alecalve

Description

We build a Docker image from the Lighthouse v5.2.0 Dockerfile with very minor changes:

  • adding curl inside the image directly
  • some changes relating to how the binary is run

The binary itself is built the same way as in this repo's Dockerfile, using make and the default values for FEATURES, PROFILE, etc.

diff --git a/Dockerfile b/Dockerfile
index e0c48699b..da033a4bf 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -13,6 +13,21 @@ FROM ubuntu:22.04
 RUN apt-get update && apt-get -y upgrade && apt-get install -y --no-install-recommends \
   libssl-dev \
   ca-certificates \
-  && apt-get clean \
-  && rm -rf /var/lib/apt/lists/*
+  curl && \
+  apt-get clean && \
+  rm -rf /var/lib/apt/lists/*
+
 COPY --from=builder /usr/local/cargo/bin/lighthouse /usr/local/bin/lighthouse
+RUN useradd -m -u 1000 -s /bin/bash lighthouse
+RUN chown -R lighthouse:lighthouse /opt
+USER lighthouse
+RUN mkdir /opt/data
+WORKDIR /opt/lighthouse
+
+# Firewall configurations
+# P2P TCP, UDP
+EXPOSE 9001/TCP
+EXPOSE 9001/UDP
+# RPC
+EXPOSE 5053
+ENTRYPOINT [ "lighthouse" ]

The node is run using these arguments:

      --datadir=/opt/data
      --http
      --listen-address=0.0.0.0
      --listen-address=::
      --http-address=::
      --http-allow-origin=*
      --http-port=5052
      --target-peers=50
      --slots-per-restore-point=32
      --prune-blobs=false
      --execution-jwt=/secrets/jwtsecret
      --execution-endpoint=http://....:8551
      --port=9000
      --port6=9000
      --network=mainnet
      --disable-deposit-contract-sync
      --state-cache-size=2
      --metrics
      --metrics-address=::
      --metrics-port=8080
      --metrics-allow-origin=*
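
For context, a rough sketch of how these flags combine with the image's lighthouse entrypoint; the image tag, volume mounts, published ports and the beacon_node subcommand are assumptions for illustration, not the reporter's exact setup:

# Illustrative invocation only: image tag, mounts and ports are placeholders,
# and the beacon_node subcommand is assumed to precede the flags listed above.
docker run -d --name lighthouse-bn \
  -v /srv/lighthouse-data:/opt/data \
  -v /srv/secrets:/secrets:ro \
  -p 9000:9000/tcp -p 9000:9000/udp -p 5052:5052 \
  lighthouse-custom:v5.2.0 \
  beacon_node --datadir=/opt/data --http --network=mainnet   # ...plus the remaining flags above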

Version

The Docker image is built from v5.2.0

Present Behaviour

Once started, the application runs for a while but eventually runs out of memory after hitting the 64 GB limit we assigned to it.

This did not happen with the same modifications applied to previous versions of Lighthouse. The last one we had tested this with was v5.0.0.

Using the image you provide (sigp/lighthouse:v5.2.0) with the same arguments on the same datadir results in reasonable, stable memory usage.

Expected Behaviour

The application should have a stable memory footprint and not run out of memory.

@alecalve
Author

Compiling with --features "portable,gnosis,slasher-lmdb,slasher-mdbx,jemalloc" --profile "maxperf" makes the application not run out of memory
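
For reference, a sketch of passing that feature set and profile through the repo's make-based build; this assumes the Makefile forwards FEATURES and PROFILE to cargo, and the direct cargo line is illustrative only:

# Same feature set and profile via the Makefile variables (assumed to be
# forwarded to cargo by the Lighthouse build).
make FEATURES="portable,gnosis,slasher-lmdb,slasher-mdbx,jemalloc" PROFILE="maxperf"

# Roughly equivalent direct cargo invocation (illustrative):
# cargo install --path lighthouse --locked --force \
#   --features "portable,gnosis,slasher-lmdb,slasher-mdbx,jemalloc" --profile maxperf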

@jimmygchen
Member

Thanks for raising this! We can take a look into this.

Could you try removing or increasing the value for --state-cache-size=2?

I haven't tried testing with a low state cache size since v5.2.0, but with the introduction of tree-states, state caches are now 32x cheaper and we recommend adjusting this to a higher number (64 if you want to allocate the equivalent of 2 state caches in 5.1.3), or just using the default of 128. Using a low state cache size may result in more cache misses and copying, and potentially more memory usage elsewhere, although I'm not very sure and need to take a deeper look.
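
As a quick sketch (the flag name comes from the discussion above; 64 and 128 are the values suggested here), that would look like:

# Either bump the cache to the suggested equivalent of the old setting...
lighthouse beacon_node --state-cache-size 64   # ...other flags unchanged
# ...or drop --state-cache-size entirely to get the default of 128.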

@alecalve
Author

Ah yes, the low value for --state-cache-size was my initial attempt at getting memory under control, until I realized it came from a much deeper issue. I'll remove it, but I can say that the issue was occurring with the default value for that setting.

@dapplion
Collaborator

Do you have metrics tracking the memory usage of this instance? It would be helpful to see whether the increase is linear or spiky, and at what rate. Also, please send us debug logs if possible so we can dive deeper.

@alecalve
Author

Here's what memory usage looked like; the process was OOM-killed each time it exceeded 64 GB:

[image: memory usage graph]

I don't have debug logs available.

@michaelsproul
Member

I wonder if maxperf is causing the compiler to find some optimisation that prevents an OOM. I will try to repro and profile the mem usage of a release build.

@michaelsproul
Member

I couldn't repro the OOM with LH v5.2.0 compiled from source with release profile. I ran for about 2 hours and mem didn't bump above 5GB. I didn't run under Docker though.

@michaelsproul
Member

I'm trying a long-range sync now, as one of the other OOM reports we had was a node syncing a month of history.

@michaelsproul
Member

No luck with the long-range sync either. Mem usage didn't bump above 3GB.

@chong-he
Member

Compiling with --features "portable,gnosis,slasher-lmdb,slasher-mdbx,jemalloc" --profile "maxperf" makes the application not run out of memory

We have another report by @rz4884 on Discord of the same issue, and the user confirmed that it was because he did not include jemalloc in --features.

@michaelsproul mentions that the Dockerfile appears not to enable jemalloc by default. We will fix this, thanks for reporting.
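
Until that fix lands, a workaround sketch is to include jemalloc explicitly when building the image; this assumes FEATURES and PROFILE are build args the upstream Dockerfile forwards to make:

# Explicitly compile jemalloc into a source-built image (build-arg names and
# tag are assumptions based on the Dockerfile's FEATURES/PROFILE variables).
docker build \
  --build-arg "FEATURES=jemalloc" \
  --build-arg "PROFILE=maxperf" \
  -t lighthouse:v5.2.0-jemalloc .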

centromere pushed a commit to coinmetrics/fullnode that referenced this issue Jun 26, 2024
The original patch was missing balance updates and the default features/profiles led to increased memory usage (sigp/lighthouse#5970)
@michaelsproul
Member

Closed by #5995

@alecalve
Author

It is happening again using 5.3.0 when building from source.

The image is built from commit d6ba8c3 using

docker build --build-arg="PROFILE=maxperf" ...

@michaelsproul
Member

@alecalve does the resulting binary show jemalloc as the allocator in ./lighthouse --version?

@alecalve
Author

alecalve commented Aug 23, 2024

It does:

$ lighthouse --version
Lighthouse v5.3.0-b11261e
BLS library: blst
BLS hardware acceleration: true
SHA256 hardware acceleration: true
Allocator: jemalloc
Profile: maxperf
Specs: mainnet (true), minimal (false), gnosis (false)

@michaelsproul
Member

What's the memory usage getting to now on 5.3.0 when the OOM occurs?

It must be something other than the lack of jemalloc. Things to check:

  • Are you running a slasher?
  • If you have metrics enabled, please share a dump of curl http://localhost:5054/metrics here or via DM (@sproul on Discord). In particular it would be good to check beacon_fork_choice_process_attestation_seconds_* (a quick way to pull just that series is sketched after this list) to rule out the issue described here: Expensive fork-choice queued attestation mutation #6206.
  • Are you running Lighthouse with any other flags, e.g. --state-cache-size, --reconstruct-historic-states, etc.?
  • What's the hardware?
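
A quick sketch for pulling just that series (5054 is the default metrics port mentioned above; the grep pattern is illustrative):

# Grab only the fork-choice attestation processing histogram from the
# Prometheus metrics endpoint (5054 is the default metrics port).
curl -s http://localhost:5054/metrics \
  | grep '^beacon_fork_choice_process_attestation_seconds'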

@alecalve
Author

alecalve commented Aug 26, 2024

  • We are not running a slasher
  • I enabled metrics; the beacon_fork_choice_process_attestation_seconds_* metrics look healthy. Metrics are here.
  • We run the following arguments:
      beacon_node
      --datadir=/opt/data
      --http
      --listen-address=0.0.0.0
      --listen-address=::
      --http-address=::
      --http-allow-origin=*
      --http-port=5052
      --checkpoint-sync-url=https://mainnet-checkpoint-sync.attestant.io/
      --reconstruct-historic-states
      --target-peers=50
      --slots-per-restore-point=256
      --historic-state-cache-size=1
      --prune-blobs=false
      --execution-jwt=/secrets/jwtsecret
      --execution-endpoint=http://geth-1-cluster:8551
      --port=9000
      --port6=9000
      --network=mainnet
      --metrics
      --disable-deposit-contract-sync

The node has finished reconstructing states, but we do see a lot of State cache missed logs.

  • Hardware-wise, it's in a k8s environment running on Linux. We allocate 32 GB of RAM to that container.

@michaelsproul
Member

michaelsproul commented Aug 26, 2024

we do see a lot of State cache missed logs though.

This sounds like the issue.

The state cache referred to in this log is for the unfinalized portion of the chain. It shouldn't frequently miss with the default --state-cache-size of 128. It doesn't have anything to do with state reconstruction.

Can you post the output of curl http://localhost:5052/lighthouse/database/info | jq? I want to check if the split.slot is advancing normally and keeping the unfinalized part of the DB a manageable size.

Can you also provide some info on what's making state queries? How many requests per second, are they made concurrently, etc?
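
For convenience, a sketch of pulling just the split point and store config out of that endpoint (port 5052 matches the --http-port above; the jq selectors are illustrative):

# Show only the split point and store config from the database info endpoint.
curl -s http://localhost:5052/lighthouse/database/info | jq '{split: .split, config: .config}'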

@alecalve
Author

  "split": {
    "slot": "9760096",
    "state_root": "0x19325b996b812c1c1d11728a0481f1c333c224e0fa6b10ebd9aefeddb34d9f44",
    "block_root": "0x52ea319a5ff08c1ca9914952690dff649c59808028cee0e450c50274faad04dc"
  }

But the node is way beyond that slot:

Aug 26 08:04:17.000 INFO Synced                                  slot: 9819619, block:    …  empty, epoch: 306863, finalized_epoch: 306861, finalized_root: 0x732f…d236, exec_hash: 0x3837…f93b (verified), peers: 13, service: slot_notifier

Full output:

{
  "schema_version": 21,
  "config": {
    "slots_per_restore_point": 256,
    "slots_per_restore_point_set_explicitly": true,
    "block_cache_size": 5,
    "state_cache_size": 128,
    "historic_state_cache_size": 1,
    "compact_on_init": false,
    "compact_on_prune": true,
    "prune_payloads": true,
    "prune_blobs": false,
    "epochs_per_blob_prune": 1,
    "blob_prune_margin_epochs": 0
  },
  "split": {
    "slot": "9760096",
    "state_root": "0x19325b996b812c1c1d11728a0481f1c333c224e0fa6b10ebd9aefeddb34d9f44",
    "block_root": "0x52ea319a5ff08c1ca9914952690dff649c59808028cee0e450c50274faad04dc"
  },
  "anchor": null,
  "blob_info": {
    "oldest_blob_slot": "9483873",
    "blobs_db": true
  }
}

@alecalve
Author

RPC-wise, the only users are some L2 nodes; I don't have insight into how frequently they query the node.

@michaelsproul
Member

But the node is way beyond that slot

This is the issue. The state migration must be failing.
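
For a sense of scale, a quick back-of-the-envelope on the numbers above (assuming 12-second mainnet slots):

# Gap between the DB split slot and the synced head slot reported above.
echo $(( 9819619 - 9760096 ))                  # 59523 slots not yet migrated
echo $(( (9819619 - 9760096) * 12 / 86400 ))   # roughly 8 days of unfinalized history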

Do you see an error log like:

WARN Block pruning failed

There was an old issue prior to v4.6.0 that could cause DB corruption similar to this:

But I'm guessing that seeing as it happened in