request help: Etcd node high cpu and memory leak #6124

Closed
anjia0532 opened this issue Jan 17, 2022 · 4 comments
Labels
doc Documentation things

Comments

@anjia0532
Contributor

Issue description

docker-compose.yaml

version: '2'
services:
  apisix-dashboard:
    image: apache/apisix-dashboard:2.8
    stdin_open: true
    volumes:
    - /tmp/apisix/logs:/usr/local/apisix/logs
    - /data/apisix/apisix-dashboard-conf/:/usr/local/apisix-dashboard/conf
  apisix:
    image: apache/apisix:2.9-alpine
    volumes:
    - /tmp/apisix/logs:/usr/local/apisix/logs
    - /data/apisix/apisix-conf/:/usr/local/apisix/conf
  etcd:
    image: bitnami/etcd:3.4.16-debian-10-r14
    environment:
      ALLOW_NONE_AUTHENTICATION: 'yes'
      ETCDCTL_API: '3'
      ETCD_DATA_DIR: /bitnami/etcd/data
      ETCD_LOG_LEVEL: info
    volumes:
    - /data/etcd/data:/bitnami/etcd/data

  discovery-syncer:
    image: anjia0532/discovery-syncer:v1.0.5
    command:
    - --config.file=discovery-syncer.yaml

apisix-config.yaml

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
apisix:
  node_listen: 9080             # APISIX listening port
  enable_heartbeat: true
  enable_admin: true
  enable_admin_cors: true
  enable_debug: false
  enable_dev_mode: false          # Sets nginx worker_processes to 1 if set to true
  enable_reuseport: true          # Enable nginx SO_REUSEPORT switch if set to true.
  enable_ipv6: true
  # hide ...
  #

nginx_config:                     # config used to render the template that generates nginx.conf
  error_log: "/dev/stderr"
  error_log_level: "warn"         # warn,error
  worker_rlimit_nofile: 20480     # the number of files a worker process can open, should be larger than worker_connections
  event:
    worker_connections: 10620
  http:
    access_log: "/dev/stdout"
    keepalive_timeout: 60s         # timeout during which a keep-alive client connection will stay open on the server side.
    client_header_timeout: 60s     # timeout for reading client request header, then 408 (Request Time-out) error is returned to the client
    client_body_timeout: 60s       # timeout for reading client request body, then 408 (Request Time-out) error is returned to the client
    send_timeout: 10s              # timeout for transmitting a response to the client.then the connection is closed
    underscores_in_headers: "on"   # default enables the use of underscores in client request header fields
    real_ip_header: "X-Real-IP"    # http://nginx.org/en/docs/http/ngx_http_realip_module.html#real_ip_header
    real_ip_from:                  # http://nginx.org/en/docs/http/ngx_http_realip_module.html#set_real_ip_from
      - 127.0.0.1
      - 'unix:'
    #lua_shared_dicts:              # add custom shared cache to nginx.conf
    #  ipc_shared_dict: 100m        # custom shared cache, format: `cache-key: cache-size`

etcd:
  host:                                 # it's possible to define multiple etcd hosts addresses of the same etcd cluster.
    - "http://etcd:2379"
  prefix: "/apisix"     # apisix configurations prefix
  timeout: 30   # 30 seconds
plugins:                          # plugin list
  - api-breaker
  - authz-keycloak
  - basic-auth
  - batch-requests
  - consumer-restriction
  - cors
  - echo
  - fault-injection
  - grpc-transcode
  - hmac-auth
  - http-logger
  - ip-restriction
  - jwt-auth
  - kafka-logger
  - key-auth
  - limit-conn
  - limit-count
  - limit-req
  - node-status
  - openid-connect
  - prometheus
  - proxy-cache
  - proxy-mirror
  - proxy-rewrite
  - redirect
  - referer-restriction
  - request-id
  - request-validation
  - response-rewrite
  - serverless-post-function
  - serverless-pre-function
  - sls-logger
  - syslog
  - tcp-logger
  - udp-logger
  - uri-blocker
  - wolf-rbac
  - zipkin
  - traffic-split
stream_plugins:
  - mqtt-proxy

plugin_attr:
  prometheus:
    export_addr:
      ip: 0.0.0.0
      port: 9091

discovery-syncer is a Go tool that syncs Nacos/Eureka instances to APISIX upstreams (via the APISIX Admin API).

In my case, APISIX is a small test cluster and etcd has only 1 node. There are fewer than 20 routes and upstreams combined, and the uptime is more than 2 months.

cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.6 LTS"

uname -a
Linux ecs-053 4.4.0-184-generic #214-Ubuntu SMP Thu Jun 4 10:14:11 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Environment

  • apisix version (cmd: apisix version): 2.9
  • OS (cmd: uname -a): Linux ae648eadf0c2 4.4.0-184-generic #214-Ubuntu SMP Thu Jun 4 10:14:11 UTC 2020 x86_64 Linux
  • OpenResty / Nginx version (cmd: nginx -V or openresty -V):
openresty -V
nginx version: openresty/1.19.3.1
built by gcc 10.2.1 20201203 (Alpine 10.2.1_pre1)
built with OpenSSL 1.1.1k  25 Mar 2021
TLS SNI support enabled
configure arguments: --prefix=/usr/local/openresty/nginx --with-cc-opt='-O2 -DNGX_LUA_ABORT_AT_PANIC -I/usr/local/openresty/pcre/include -I/usr/local/openresty/openssl/include' --add-module=../ngx_devel_kit-0.3.1 --add-module=../echo-nginx-module-0.62 --add-module=../xss-nginx-module-0.06 --add-module=../ngx_coolkit-0.2 --add-module=../set-misc-nginx-module-0.32 --add-module=../form-input-nginx-module-0.12 --add-module=../encrypted-session-nginx-module-0.08 --add-module=../srcache-nginx-module-0.32 --add-module=../ngx_lua-0.10.19 --add-module=../ngx_lua_upstream-0.07 --add-module=../headers-more-nginx-module-0.33 --add-module=../array-var-nginx-module-0.05 --add-module=../memc-nginx-module-0.19 --add-module=../redis2-nginx-module-0.15 --add-module=../redis-nginx-module-0.3.7 --add-module=../rds-json-nginx-module-0.15 --add-module=../rds-csv-nginx-module-0.09 --add-module=../ngx_stream_lua-0.0.9 --with-ld-opt='-Wl,-rpath,/usr/local/openresty/luajit/lib -L/usr/local/openresty/pcre/lib -L/usr/local/openresty/openssl/lib -Wl,-rpath,/usr/local/openresty/pcre/lib:/usr/local/openresty/openssl/lib' --with-pcre --with-compat --with-file-aio --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_geoip_module=dynamic --with-http_gunzip_module --with-http_gzip_static_module --with-http_image_filter_module=dynamic --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-http_secure_link_module --with-http_slice_module --with-http_ssl_module --with-http_stub_status_module --with-http_sub_module --with-http_v2_module --with-http_xslt_module=dynamic --with-ipv6 --with-mail --with-mail_ssl_module --with-md5-asm --with-pcre-jit --with-sha1-asm --with-stream --with-stream_ssl_module --with-threads --with-stream --with-stream_ssl_preread_module
  • etcd version, if have (cmd: run curl http://127.0.0.1:9090/v1/server_info to get the info from server-info API):
curl http://127.0.0.1:9090/v1/server_info
<html>
<head><title>404 Not Found</title></head>
<body>
<center><h1>404 Not Found</h1></center>
<hr><center>openresty</center>
</body>
</html>
  • apisix-dashboard version, if have:
  • the plugin runner version, if the issue is about a plugin runner (cmd: depended on the kind of runner):
  • luarocks version, if the issue is about installation (cmd: luarocks --version): /usr/local/openresty/luajit/bin/luarocks 3.7.0
@anjia0532
Contributor Author

#5723

@anjia0532
Contributor Author

anjia0532 commented Jan 17, 2022

Thanks to 孙冉-小电 from the WeChat group.

Thanks to For GG from the WeChat group.

Thanks to 罗泽轩-支流科技 from the WeChat group.

Reference blog: etcd生产环境实践 (etcd production environment practice)

# Get the current revision
$ rev=$(ETCDCTL_API=3 etcdctl --endpoints=:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9]*')
# Compact away all old revisions
$ ETCDCTL_API=3 etcdctl compact $rev
# Defragment
$ ETCDCTL_API=3 etcdctl defrag
# Disarm the alarm
$ ETCDCTL_API=3 etcdctl alarm disarm
# Verify that writes succeed again
$ ETCDCTL_API=3 etcdctl put newkey 123
# Defragment once more
$ etcdctl defrag
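
The compact, defrag, and disarm steps above could be wrapped in a small script so they always run in the same order. This is only a minimal sketch, not a tested maintenance tool: the function names `extract_revision` and `compact_and_defrag` and the `ENDPOINT` variable are hypothetical, and it assumes `etcdctl` (v3 API) is on the PATH and can reach a single-node etcd.

```shell
#!/bin/sh
# Hypothetical helper names; only the etcdctl subcommands come from the steps above.

ENDPOINT="${ENDPOINT:-127.0.0.1:2379}"   # assumed endpoint, adjust as needed

# Read the JSON from `etcdctl endpoint status --write-out=json` on stdin
# and print the first "revision" number found in it.
extract_revision() {
  grep -Eo '"revision":[0-9]+' | head -n 1 | grep -Eo '[0-9]+'
}

# Compact history up to the current revision, defragment, then clear alarms.
# Stops at the first step that fails.
compact_and_defrag() {
  rev=$(ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINT" endpoint status --write-out=json | extract_revision)
  if [ -z "$rev" ]; then
    echo "could not read the current revision" >&2
    return 1
  fi
  ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINT" compact "$rev" &&
    ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINT" defrag &&
    ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINT" alarm disarm
}
```

After sourcing the file, calling `compact_and_defrag` would run the three steps. Defragmenting a live member blocks reads and writes on that member while it runs, so a quiet period is probably the safer time to call it.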

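Reference blog: ETCD磁盘空间爆满解决方案 (fixing etcd running out of disk space)

Reference, etcd docs: Defragmentation
Reference, etcd docs: History compaction: v3 API Key-Value Database
Reference blog: golang pprof etcd 性能分析 (profiling etcd performance with Go pprof)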

@anjia0532
Contributor Author

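Use the auto-compaction feature with care, according to your actual situation, and choose a suitable retention window.

For example, compact by time (before 3.3.0 only time-based compaction is available; from 3.3.0 on, time-based compaction is the default mode):
etcd --auto-compaction-mode=periodic --auto-compaction-retention=12h

Compact by revision (3.3.0 and later):
etcd --auto-compaction-mode=revision --auto-compaction-retention=1000

Read the official documentation, History compaction: v3 API Key-Value Database, carefully, and configure this according to your actual etcd version and situation.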
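
With the docker-compose setup from the issue description, the same flags can probably be supplied as environment variables, since etcd reads each flag from a matching `ETCD_`-prefixed variable (the compose file already sets `ETCD_DATA_DIR` and `ETCD_LOG_LEVEL` this way). A minimal sketch, assuming the bitnami image passes these variables through to etcd unchanged:

```shell
# Environment-variable form of
#   --auto-compaction-mode=periodic --auto-compaction-retention=12h
# The values are only an example; pick a window that fits your workload.
export ETCD_AUTO_COMPACTION_MODE=periodic
export ETCD_AUTO_COMPACTION_RETENTION=12h
```

In the compose file these would go under the etcd service's `environment:` key, next to `ETCD_LOG_LEVEL`.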

@juzhiyuan juzhiyuan added the doc Documentation things label Jan 18, 2022
@anjia0532
Contributor Author

If you install with Helm, you can configure the etcd command with --set:
--set "etcd.enabled=true,etcd.command={'/opt/bitnami/etcd/bin/etcd','--auto-compaction-mode=periodic','--auto-compaction-retention=12h'}"
This makes etcd auto-compact with a 12-hour retention window.
