
Memory corruption observed when using Moka v0.9.6 #281

Closed
tatsuya6502 opened this issue Jun 19, 2023 · 15 comments · Fixed by #283
Labels: bug

tatsuya6502 commented Jun 19, 2023

Split from MaterializeInc/database-issues#93.

Note: I started looking into Moka because we recently tried using it in MaterializeInc/materialize#19614, but had to revert it because we were observing memory corruption.

  • Moka v0.9.6
  • No stack traces or core dumps are available.
  • The target platform (CPU arch and OS) is not yet known. => Linux x86_64
  • moka::sync::Cache is used in async contexts.
  • Multiple (many?) Cache instances. (One instance per blob URL; the exact count is unknown.)
  • A weigher and an eviction listener (immediate mode) are configured. (See the sketch below.)
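
For reference, below is a minimal sketch of a cache configured along these lines against moka v0.9's builder API. It is not the Materialize code; the key/value types, the capacity, and the weigher logic are illustrative assumptions.

use moka::notification::RemovalCause;
use moka::sync::Cache;

fn build_cache() -> Cache<String, Vec<u8>> {
    Cache::builder()
        // Weighted max capacity; the unit is whatever the weigher below returns.
        .max_capacity(64 * 1024 * 1024)
        // Weigher: charge each entry by the size of its value.
        .weigher(|_key: &String, value: &Vec<u8>| {
            value.len().try_into().unwrap_or(u32::MAX)
        })
        // Eviction listener (immediate delivery mode in v0.9.x).
        .eviction_listener(|key, _value, cause: RemovalCause| {
            eprintln!("evicted {key}: {cause:?}");
        })
        .build()
}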

I am trying to reproduce the issue using mokabench, but no luck so far: #279 (comment)

CC: @ParkMyCar

tatsuya6502 self-assigned this Jun 20, 2023

tatsuya6502 commented Jun 20, 2023

Details of the issue: https://github.com/MaterializeInc/database-issues/issues/5879

It seems easy to reproduce if we can run an end-to-end integration test called "Long Zippy w/ user tables".

This happened twice in a row in long Zippy w/ user tables test: https://buildkite.com/materialize/release-qualification/builds/197#018883bf-5086-4a43-a8b4-ede1ef192de5


tatsuya6502 commented Jun 20, 2023

https://github.com/MaterializeInc/materialize/blob/bfbebe04a7d181f1bac91f01785309a52cf2de34/ci/release-qualification/pipeline.yml#L99-L108

  - id: zippy-user-tables-large
    label: "Long Zippy w/ user tables"
    timeout_in_minutes: 2880
    agents:
      queue: linux-x86_64-large
    artifact_paths: junit_*.xml
    plugins:
      - ./ci/plugins/mzcompose:
          composition: zippy
          args: [--scenario=UserTablesLarge, --actions=200000]


tatsuya6502 commented Jun 21, 2023

I ran a short version of the "Zippy w/ user tables" test on Linux x86_64. I used the same scenario, but a smaller number of actions: 2,000 instead of 200,000.

I could not reproduce the issue. I will try a longer run.

Environment:

  • Docker (x86_64)
  • Intel Core i7-12700F
    • 8 × performance cores × 2 hyper threads
    • 4 × efficient cores

Terminal Log:

$ git clone git@github.com:MaterializeInc/materialize.git
$ cd materialize

$ git branch moka-segfault 07386e427f
$ git checkout moka-segfault

$ rg moka
Cargo.lock
2903: "moka",
3241:name = "moka"
4380: "moka",
8629: "moka",

src/persist-client/Cargo.toml
41:moka = { version = "0.9.6", default-features = false, features = ["sync"] }

src/persist-client/src/internal/cache.rs
16:use moka::notification::RemovalCause;
17:use moka::sync::Cache;
100:        // This could maybe use moka's async cache to unify any concurrent

src/workspace-hack/Cargo.toml
63:moka = { version = "0.9.6", default-features = false, features = ["atomic64", "sync"] }
164:moka = { version = "0.9.6", default-features = false, features = ["atomic64", "sync"] }
$ ./bin/mzcompose --find zippy run default --scenario UserTablesLarge --actions 2000
==> Collecting mzbuild images
materialize/ubuntu-base:mzbuild-WBP7JFLWFQYUIHLXR5JSOAJ74VVKI6O7
materialize/clusterd:mzbuild-Q77JA6BIOATFUZAWANSIJFOCUXV3GOGX
materialize/materialized:mzbuild-IIUOBQEI7JVK3YSGYYV6FP3EP3JGFF5I
materialize/test-certs:mzbuild-6JWZ2MGLSYOCL3MLHIEIJMMLQLEOLCNY
materialize/postgres:mzbuild-ENTVWBFMO762DEEKAMURDMVMQ52GIO63
materialize/testdrive:mzbuild-ANAR5HEUWWDMTQUJ3ZA7XJWEY64H2ZTB
...
==> Running test case workflow-default
==> Running workflow default
...
[+] Running 3/3
 ✔ Container zippy-cockroach-1     Healthy
 ✔ Container zippy-minio-1         Healthy
 ✔ Container zippy-materialized-1  Healthy
> ALTER SYSTEM SET enable_unmanaged_cluster_replicas = true;
> CREATE CLUSTER storaged REPLICAS (r2 (
                STORAGECTL ADDRESSES ['storaged:2100'],
                STORAGE ADDRESSES ['storaged:2103'],
                COMPUTECTL ADDRESSES ['storaged:2101'],
                COMPUTE ADDRESSES ['storaged:2102'],
                WORKERS 4
            ))
[+] Stopping 1/1
 ✔ Container zippy-materialized-1  Stopped
Going to remove zippy-materialized-1
[+] Removing 1/0
 ✔ Container zippy-materialized-1  Removed
[+] Building 0.0s (0/0)
[+] Running 1/1
 ✔ Container zippy-testdrive-1  Healthy
Generating test...
Running test...
--- MaterializeInc/database-issues#1: KafkaStart
[+] Building 0.0s (0/0)
[+] Running 2/2
 ✔ Container zippy-zookeeper-1  Healthy
 ✔ Container zippy-kafka-1      Healthy
--- MaterializeInc/database-issues#2: CockroachStart
[+] Building 0.0s (0/0)
[+] Running 1/1
 ✔ Container zippy-cockroach-1  Healthy
--- MaterializeInc/database-issues#3: MinioStart
[+] Building 0.0s (0/0)
[+] Running 1/1
 ✔ Container zippy-minio-1  Healthy
--- MaterializeInc/database-issues#4: MzStart
[+] Building 0.0s (0/0)
[+] Running 3/3
 ✔ Container zippy-cockroach-1     Healthy
 ✔ Container zippy-minio-1         Healthy
 ✔ Container zippy-materialized-1  Healthy
--- MaterializeInc/materialize#5: StoragedStart
[+] Building 0.0s (0/0)
[+] Running 1/1
 ✔ Container zippy-storaged-1  Healthy
--- MaterializeInc/database-issues#5: CreateTable
> CREATE TABLE table_0 (f1 INTEGER);
rows match; continuing at ts 1687305573.2167296
> INSERT INTO table_0 VALUES (0);
rows match; continuing at ts 1687305573.2269592

...

> INSERT INTO table_0 SELECT * FROM generate_series(568924, 572569);
rows match; continuing at ts 1687306589.1175394
==> mzcompose: test case workflow-default succeeded


tatsuya6502 commented Jun 21, 2023

There are almost 300 commits between when moka was added and when it was removed:
MaterializeInc/materialize@1bcd1c4...07386e4

I need to know when (on which commit) the test last failed.

@tatsuya6502

> There are almost 300 commits between when moka was added and when it was removed:
> MaterializeInc/materialize@1bcd1c4...07386e4
>
> I need to know when (on which commit) the test last failed.

The following commits were mentioned in https://github.com/MaterializeInc/database-issues/issues/5879.

@tatsuya6502

Ran the test (with 5,000 actions) several times on my home PC, but I have not reproduced the issue.

Environment:

  • Docker (x86_64)
  • Intel Core i7-12700F
    • 8 × performance cores × 2 hyper threads
    • 4 × efficient cores

My PC (Windows 11 + WSL2) does not have enough RAM to run a longer Zippy test. It has only 16GB of RAM, and Docker containers like Kafka sometimes get killed while the test is running. I will expand the RAM to 32GB.

I hope running a longer test will help reproduce the issue. But another factor could be the number of available cores/hyper-threads. My PC has 16 cores and 20 threads, but their EC2 instance has only 8 vCPUs:

  • c5.2xlarge instances, which have 8 vCPUs (i.e., hyperthreads) and 16GB of memory.

I will probably adjust the number of hyper-threads available to Docker.

Also, I will add some tracing logs to both Moka and Materialize to get a better understanding of the access pattern.

@0xdeafbeef

@tatsuya6502 I have 32 threads / 128 GB of RAM and can leave the test running overnight if that would help.

@tatsuya6502

@0xdeafbeef Thanks for the offer. Wow, you have a great machine!

I think I am going to be okay. I already ordered the RAM modules and am waiting for them to arrive. I think 32GB should be enough for now.

@tatsuya6502

FYI, I got the RAM modules for my PC, and confirmed it can run the test with at least 50,000 actions. I allocated 24GiB of memory and only 8 logical CPU cores to WSL2.

The reason I did not try 200,000 actions is that it would take a long time (estimated ~30 hours). 50,000 actions took 7 hours and 14 minutes. The memory utilization was fine, so I had a feeling it could run 200,000 actions without issue.

Also, before I got the RAM modules, I was able to run 80,000 actions. I found that reducing the number of logical CPU cores available to WSL2/Docker reduces the memory utilization of the test, so I allocated only 8 cores (instead of 20) and 10GiB of memory (instead of 8GiB) to WSL2, and it was able to run 80,000 actions.

@tatsuya6502

I talked to a Materialize engineer. They will try to reproduce the issue using an EC2 instance and give me the full set of instructions.

MaterializeInc/materialize#20136 (comment)

> Hopefully someone from the QA team has time to attempt the repro early next week and give you the full set of instructions.


tatsuya6502 commented Jul 2, 2023

With some help from Materialize engineers, I made some progress: I reproduced the segmentation fault of Moka v0.9.6 in Materialize and got a core dump to analyze.

Notes: I had to use an Amazon EC2 c6a.12xlarge instance, which has 48 vCPUs, because reproducing with the current test on my home PC (20 hyper-threads) can take a few days (details). Also, it was a bit tricky to get a core dump in a Docker container, so it took me some time to figure out. (A modified Materialize test harness for getting core dumps)

Here is the back trace of the crashed thread:

(attachment: moka-segfault-2023-07-01)

It crashed in the Deque::move_to_back method (src/common/deque.rs#L202), which was called by the base_cache::Inner::handle_upsert method (src/sync_base/base_cache.rs#L1414).

Materialize uses Moka v0.9.6, and this problem may have been fixed by #259 in v0.11.0.

I will back-port the fix to v0.9.x, and run the same test to see if the problem goes away.

Idea of steps to reproduce

If this is related to #259, the steps to reproduce could be the following. However, the end result depends largely on how the compiler optimizes reads through an immutable reference (&) to an LRU double-ended queue (deque). A hypothetical sketch of these steps in code form follows the code excerpt below.

  1. Create a cache with weighted capacity 6.

  2. Stop the periodic sync.

  3. Get non-existing key X once. (popularity = 1)

  4. Get non-existing key Y once. (popularity = 1)

  5. Insert key X with weight 2.

  6. Insert key Y with weight 2.

  7. Call sync:

    • Both X and Y are admitted.
    • Now the LRU deque has: (front) XY (back)
  8. Get non-existing key V three times. (popularity = 3)

  9. Insert V with weight 3.

  10. Invalidate Y.

    • This removes Y from the concurrent hash table (cht), but removing its node from the deque will be delayed to a future sync.
  11. Call sync. The following will happen, and the last step can cause a segmentation fault:

    • X and Y are selected as the victims (entries to evict), and V is marked as admitted.
      • weights: X + Y = 4, V = 3
      • popularity: X + Y = 2, V = 3
    • X is removed from the cht, and its deque node is unlinked and dropped.
      • Unlinking X's node changes Y's node's prev pointer from Some(NonNull(X)) to None.
    • Because of step 10, Y is no longer in the cht, so Y's deque node is added to the skipped_nodes vector.
    • V's deque node is created and pushed to the back of the deque.
    • The nodes in skipped_nodes are processed (there is only the Y node): the Y node will be moved to the back of the deque.
      • Deque's move_to_back method checks Y's prev pointer, which is None now.
      • However, because of the bug (fixed by #259 for v0.11.0), optimized machine instructions may use a stale value of the prev pointer held in a CPU register (still pointing to X) instead of reloading the actual value None from memory.
      • X was already dropped, so accessing X (via *prev.as_ptr() in the code below) causes the segmentation fault.

src/common/deque.rs#L202

188: pub(crate) unsafe fn move_to_back(&mut self, mut node: NonNull<DeqNode<T>>) {
        ...
201:    match node.prev {
202:        Some(prev) if node.next.is_some() => (*prev.as_ptr()).next = node.next,
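
For illustration, here is a hypothetical sketch of the steps above in code form. It is not a working test: moka v0.9 does not expose a way to pause the housekeeping thread or to trigger the maintenance ("sync") step from outside, so run_pending_maintenance below is only a placeholder, and the key/weight values simply follow the steps listed above. (The next comment also notes that step 10 must land inside a narrower time window than this linear sequence suggests.)

use moka::sync::Cache;

fn repro_sketch() {
    // Step 1: weighted capacity 6; the value itself is used as the entry weight.
    let cache: Cache<char, u32> = Cache::builder()
        .max_capacity(6)
        .weigher(|_key: &char, value: &u32| *value)
        .build();

    // Step 2: stop the periodic sync (no public API for this in v0.9).

    let _ = cache.get(&'X');  // Step 3: miss -> popularity(X) = 1
    let _ = cache.get(&'Y');  // Step 4: miss -> popularity(Y) = 1
    cache.insert('X', 2);     // Step 5: weight 2
    cache.insert('Y', 2);     // Step 6: weight 2

    run_pending_maintenance(&cache); // Step 7: X and Y admitted; deque = (front) X, Y (back)

    for _ in 0..3 {
        let _ = cache.get(&'V'); // Step 8: popularity(V) = 3
    }
    cache.insert('V', 3);        // Step 9: weight 3

    cache.invalidate(&'Y');      // Step 10: Y leaves the cht; its deque node is unlinked later

    run_pending_maintenance(&cache); // Step 11: evict X and Y, admit V, then move Y's stale node
}

// Placeholder only: moka v0.9 runs this housekeeping on background threads
// and does not expose it as a public method.
fn run_pending_maintenance(_cache: &Cache<char, u32>) {}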


tatsuya6502 commented Jul 2, 2023

> 10. Invalidate Y.
>     • This removes Y from the concurrent hash table (cht), but removing its node from the deque will be delayed to a future sync.
> 11. Call sync. The following will happen, and the last step can cause a segmentation fault:
>     • X and Y are selected as the victims (entries to evict), and V is marked as admitted.
>       • weights: X + Y = 4, V = 3
>       • popularity: X + Y = 2, V = 3
>     • X is removed from the cht, and its deque node is unlinked and dropped.
>       • Unlinking X's node changes Y's node's prev pointer from Some(NonNull(X)) to None.
>     • Because of step 10, Y is no longer in the cht, so Y's deque node is added to the skipped_nodes vector.

I reviewed the code and realized that the above will not reproduce the problem.

  • Step 10 (invalidate Y) needs to happen within a narrower time window; see the sketch after this list.
  • It has to be done:
    • after the first bullet of 11 (X and Y are selected as ...)
    • and before the third bullet of 11 (Because of step 10, Y is no longer in the cht, ...).
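
A hypothetical way to express that timing in code: the invalidation is issued from a second thread so that it can land inside the maintenance run. As before, run_pending_maintenance is only a placeholder for moka's internal housekeeping step, and whether the invalidation actually hits the window is up to the scheduler, so this just illustrates the shape of the race.

use moka::sync::Cache;
use std::thread;

fn race_invalidate_with_sync(cache: &Cache<char, u32>) {
    thread::scope(|s| {
        // Issue the invalidation concurrently, hoping it lands after the victims
        // (X and Y) are selected but before Y's cht entry is checked.
        s.spawn(|| cache.invalidate(&'Y'));
        // Meanwhile, run the maintenance step on this thread.
        run_pending_maintenance(cache);
    });
}

// Placeholder, as in the sketch above: the real maintenance step is internal to moka v0.9.
fn run_pending_maintenance(_cache: &Cache<char, u32>) {}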

tatsuya6502 linked a pull request Jul 3, 2023 that will close this issue

tatsuya6502 commented Jul 3, 2023

> Materialize uses Moka v0.9.6, and this problem may have been fixed by #259 in v0.11.0.
>
> I will back-port the fix to v0.9.x, and run the same test to see if the problem goes away.

I backported the fix to v0.9.x:

I am running "Long Zippy w/ user tables" on an EC2 c6a.12xlarge instance (48 vCPUs). It has been running for 13 hours 14 hours without reproducing the problem. This looks promising.

I will release Moka v0.9.8 with #283 after I finish running the pre-release tests. Then I will let the Materialize engineers know that the problem may have been fixed.

@tatsuya6502

I will backport #259 to Moka v0.10.x as well. I will keep this issue open until that is done.

@tatsuya6502

Closing as completed.

Moka v0.9.x

Moka v0.10.x
