[Bug]: etcd start failed: max entry size limit exceeded #26855

Closed
1 task done
darkerin opened this issue Sep 5, 2023 · 12 comments
Assignees
Labels
kind/bug Issues or changes related a bug stale indicates no udpates for 30 days triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@darkerin

darkerin commented Sep 5, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: v2.3.0
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka):    rocksmq
- SDK version(e.g. pymilvus v2.0.0rc2):
- OS(Ubuntu or CentOS): Ubuntu18.04
- CPU/Memory:  4C32G
- GPU: 
- Others:

Current Behavior

The etcd container failed to start and output the following error:

{"level":"info","ts":"2023-09-05T11:05:51.158Z","caller":"embed/etcd.go:371","msg":"closing etcd server","name":"default","data-dir":"/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://127.0.0.1:2379"]}
{"level":"info","ts":"2023-09-05T11:05:51.158Z","caller":"embed/etcd.go:373","msg":"closed etcd server","name":"default","data-dir":"/etcd","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://127.0.0.1:2379"]}
{"level":"fatal","ts":"2023-09-05T11:05:51.158Z","caller":"etcdmain/etcd.go:204","msg":"discovery failed","error":"wal: max entry size limit exceeded, recBytes: 256, fileSize(64000000) - offset(63999936) - padBytes(0) = entryLimit(64)","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/etcdmain/etcd.go:204\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/etcdmain/main.go:40\nmain.main\n\t/tmp/etcd-release-3.5.5/etcd/release/etcd/server/main.go:32\nruntime.main\n\t/usr/local/google/home/siarkowicz/.gvm/gos/go1.16.15/src/runtime/proc.go:225"}
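
A note on reading the fatal message: the numbers in it appear to be self-consistent, and the check they describe can be replayed by hand. The following is a minimal Python sketch, not etcd's code. The regular expression and the helper name `parse_wal_error` are made up for illustration, and the interpretation (the record header claims 256 bytes while only 64 bytes remain before the end of the 64,000,000-byte WAL file) is inferred from the message text alone.

```python
import re

# Error text copied from the fatal log line above.
ERROR = (
    "wal: max entry size limit exceeded, recBytes: 256, "
    "fileSize(64000000) - offset(63999936) - padBytes(0) = entryLimit(64)"
)

# Hypothetical pattern for pulling the numbers out of the message.
PATTERN = re.compile(
    r"recBytes: (?P<rec>\d+), "
    r"fileSize\((?P<size>\d+)\) - offset\((?P<offset>\d+)\) - "
    r"padBytes\((?P<pad>\d+)\) = entryLimit\((?P<limit>\d+)\)"
)


def parse_wal_error(message):
    """Return the numeric fields of the message as a dict of ints."""
    match = PATTERN.search(message)
    if match is None:
        raise ValueError("not a 'max entry size limit exceeded' message")
    return {name: int(value) for name, value in match.groupdict().items()}


fields = parse_wal_error(ERROR)

# Bytes left in the WAL file after the current read offset.
remaining = fields["size"] - fields["offset"] - fields["pad"]

print(remaining)                  # 64
print(fields["rec"] > remaining)  # True: the record cannot fit in the file
```

Read this way, the 64 in the message is the number of bytes left in the file, not the size of an entry, and the 64000000 is the size of the WAL file itself.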

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

milvus.log:
milvus.log

etcd
etcd.log

Anything else?

No response

@darkerin darkerin added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 5, 2023
@xiaofan-luan
Contributor

It seems that you need to bring up your etcd cluster first.

According to etcd-io/etcd#14025, this is a bug in etcd versions before 3.5.5.

But Milvus should be using etcd 3.5.5.

Are you sharing the etcd instance with other services?

@xiaofan-luan
Contributor

Also, I don't think Milvus could generate a 64M WAL entry?

@yanliang567
Contributor

@darkerin as the comments above ask, please share more information about how you deployed Milvus and any changes you made to the configuration.
/assign @darkerin
/unassign

@sre-ci-robot sre-ci-robot assigned darkerin and unassigned yanliang567 Sep 6, 2023
@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 6, 2023
@darkerin
Author

darkerin commented Sep 6, 2023

@xiaofan-luan @yanliang567

The etcd 3.5.5 container is used only by Milvus.

Before this, I tried to create 500 partitions in a collection and modified the etcd parameter ETCD_MAX_TXN_OPS (see #26748).

When I shut down the container and tried to start it again, it failed.

@xiaofan-luan
Contributor

/assign @yanliang567
Could you try to reproduce this issue on 2.3.x with 4096 partitions?

@xiaofan-luan
Contributor

let us try to reproduce this issue

@darkerin
Author

darkerin commented Sep 8, 2023

> let us try to reproduce this issue

Thanks a lot.

@yanliang567
Contributor

I could not reproduce this on the latest 2.3.0 image.

@stale

stale bot commented Oct 14, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen.

@stale stale bot added the stale indicates no udpates for 30 days label Oct 14, 2023
@stale stale bot closed this as completed Oct 22, 2023
@yhmo
Contributor

yhmo commented Apr 18, 2024

Another user encountered this error with standalone (embedded etcd).
etcd

Another etcd issue says this bug is fixed in etcd v3.5.7:
etcd-io/etcd#15090
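
For anyone checking whether a deployment might be affected, the version comparison can be sketched as below. It assumes, based only on the comment above, that the fix first shipped in etcd v3.5.7. `parse_version` and `has_wal_fix` are hypothetical helpers, not part of etcd or Milvus, and pre-release suffixes such as `-rc.1` are not handled.

```python
def parse_version(text):
    """Turn 'v3.5.7' or '3.5.7' into the tuple (3, 5, 7)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Assumption taken from the comment above, not verified here.
FIXED_IN = parse_version("v3.5.7")


def has_wal_fix(version):
    """Return True if the given etcd version is at or past the assumed fix."""
    return parse_version(version) >= FIXED_IN


print(has_wal_fix("3.5.5"))   # False: the version in the original report
print(has_wal_fix("v3.5.7"))  # True
```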

@tianshihan818

> Another user encountered this error with standalone(embedded etcd). etcd
>
> There is another issue in etcd says this bug is fixed in etcd v3.5.7 etcd-io/etcd#15090

Hi! Does Milvus support deploying etcd v3.5.7 now?

@xiaofan-luan
Contributor

@LoveEachDay
