ETCD DB size metrics is not correct sometimes #8080

Closed
armstrongli opened this issue Jun 12, 2017 · 5 comments

armstrongli commented Jun 12, 2017

We hit this issue when re-adding members to a cluster whose DB size is larger than 5 GB.

The flow I followed is:

  1. Start one ETCD cluster (3 members)
  2. Write more than 5 GB of data to the ETCD cluster
  3. Remove one member from the cluster
  4. Stop all members of the cluster
  5. Restart all members

After the restart, the DB size metric reported by the ETCD servers is 0.

After taking one snapshot, the DB size metric on all members is back to normal.

[Screenshot: 2017-06-12 at 10:32:31 AM]

Here are the metrics I fetched with curl from one of the members:

  • Before the snapshot
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 0

metrics-before-snapshot.txt.zip

  • After the snapshot
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 7.014617088e+09

metrics-after-snapshot.txt.zip
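
To compare the two scrapes above without reading them by eye, here is a minimal sketch that parses the Prometheus text output and pulls out the DB size gauge. The helper names (`parse_metrics`, `format_gb`) are made up for illustration and are not part of etcd. Only unlabelled samples are handled, which is enough for this metric. Dividing by 10^9 turns 7.014617088e+09 bytes into 7.0 GB, the same figure as in the endpoint status table.

```python
from typing import Dict

# Metric name taken from the curl output above.
METRIC = "etcd_debugging_mvcc_db_total_size_in_bytes"

BEFORE = """\
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 0
"""

AFTER = """\
# HELP etcd_debugging_mvcc_db_total_size_in_bytes Total size of the underlying database in bytes.
# TYPE etcd_debugging_mvcc_db_total_size_in_bytes gauge
etcd_debugging_mvcc_db_total_size_in_bytes 7.014617088e+09
"""


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse unlabelled samples from Prometheus text-format output (hypothetical helper)."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        # Labelled samples (name{...}) are out of scope for this sketch.
        if "{" in name:
            continue
        try:
            # The first field after the name is the value; a timestamp may follow.
            samples[name] = float(rest.split()[0])
        except (ValueError, IndexError):
            continue
    return samples


def format_gb(size_in_bytes: float) -> str:
    """Format a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{size_in_bytes / 1e9:.1f} GB"


if __name__ == "__main__":
    for label, text in (("before snapshot", BEFORE), ("after snapshot", AFTER)):
        print(label, format_gb(parse_metrics(text)[METRIC]))
```

The same function could be fed the body of a live `/metrics` scrape and compared against the DB SIZE column of the endpoint status output.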

Cluster endpoint status

+-------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
|                    ENDPOINT                     |        ID        | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+-------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://tess-node-p4hnm-1189140.33.tess.io:4001 | 416b952ae811a0d7 | 3.0.15  | 7.0 GB  | false     |       378 |    1670896 |
| https://tess-node-nf3nq-1189139.33.tess.io:4001 | c0fc36cf224bac7c | 3.0.15  | 7.0 GB  | false     |       378 |    1670896 |
| https://tess-node-phkwk-1189138.33.tess.io:4001 | f9d31f1cd85ee669 | 3.0.15  | 7.0 GB  | true      |       378 |    1670897 |
+-------------------------------------------------+------------------+---------+---------+-----------+-----------+------------+

armstrongli commented

The endpoint status is correct both before and after taking the snapshot.

armstrongli changed the title from "ETCD metrics is not correct sometimes" to "ETCD DB size metrics is not correct sometimes" on Jun 12, 2017

xiang90 commented Jun 12, 2017

The database size is a debugging metric and is calculated lazily. If no new data is committed, it won't be updated.
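
As an illustration of what "calculated lazily" can mean, here is a toy model in which the reported gauge is refreshed only as a side effect of a commit. This is a sketch under that assumption, not etcd's actual code; the class `ToyBackend` and its fields are hypothetical. It shows how a freshly restarted process can report 0 while the file on disk is already several gigabytes.

```python
class ToyBackend:
    """Toy model of a store whose size gauge is updated only on commit.

    Hypothetical illustration; it does not mirror etcd's implementation.
    """

    def __init__(self, size_on_disk: int) -> None:
        # Size of the database file that already exists on disk at startup.
        self.size_on_disk = size_on_disk
        # The gauge starts at 0 on every (re)start and is not read from disk.
        self.reported_size = 0

    def commit(self, new_bytes: int) -> None:
        """Commit new data; the gauge is refreshed only here."""
        self.size_on_disk += new_bytes
        self.reported_size = self.size_on_disk

    def actual_size(self) -> int:
        """Size computed on demand, as a status query would see it."""
        return self.size_on_disk


if __name__ == "__main__":
    backend = ToyBackend(size_on_disk=7_014_617_088)  # restart with existing data
    print(backend.reported_size, backend.actual_size())  # gauge is stale, status is not
    backend.commit(1024)
    print(backend.reported_size, backend.actual_size())  # gauge catches up after a commit
```

In this model the on-demand size is always correct, while the gauge stays at 0 until the first commit after a restart.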


armstrongli commented Jun 12, 2017

It didn't get updated for days.
[Screenshot: 2017-06-12 at 1:52:55 PM]


xiang90 commented Jun 12, 2017

Can you share a script that I can run locally to reproduce it?

armstrongli commented

You just need to create one cluster that runs v3.0.15 and v3.1.8 members together, and run the benchmark on it. That's all.

heyitsanthony added a commit to heyitsanthony/etcd that referenced this issue Jun 16, 2017
gyuho pushed a commit that referenced this issue Jun 20, 2017
yudai pushed a commit to yudai/etcd that referenced this issue Oct 5, 2017