
glustershd memory keeps increasing while creating PVCs #1467

Open
PrasadDesala opened this issue Jan 7, 2019 · 11 comments
@PrasadDesala

glusterfs memory increased from 74MB to 6.8G while creating 200 PVCs.

Before:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1150 root 20 0 3637200 74560 3320 S 0.0 0.2 0:01.52 glusterfs

After 200 PVCs are created:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1150 root 20 0 101.0g 6.8g 3388 S 94.1 21.6 17:43.07 glusterfs
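
For reference, a minimal sketch (bash) of how this growth can be tracked over time, using the PID-file path this setup's glustershd runs with (/var/run/glusterd2/glustershd.pid):

    # sample the self-heal daemon's RSS/VSZ every 30 seconds while PVCs are created
    SHD_PID=$(cat /var/run/glusterd2/glustershd.pid)
    while true; do
        date
        ps -o pid,rss,vsz,pcpu,comm -p "$SHD_PID"
        sleep 30
    done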

Below are a few other observations:

  1. For a few of the volumes, the brick port shows as -1 (see the sketch after this list)
    Volume : pvc-9480160e-1279-11e9-a7a2-5254001ae311
    +--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
    | BRICK ID | HOST | PATH | ONLINE | PORT | PID |
    +--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
    | b7a95b9b-17da-4220-a38d-2d23eb75c83a | gluster-kube3-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-9480160e-1279-11e9-a7a2-5254001ae311/subvol1/brick1/brick | true | 40635 | 3612 |
    | 133011b8-1825-4b6e-87e1-d7bed7332f55 | gluster-kube1-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-9480160e-1279-11e9-a7a2-5254001ae311/subvol1/brick2/brick | true | -1 | 3041 |
    | ebfb7837-8657-46c9-aad9-449b6a1ba6bf | gluster-kube2-0.glusterd2.gcs | /var/run/glusterd2/bricks/pvc-9480160e-1279-11e9-a7a2-5254001ae311/subvol1/brick3/brick | true | 45864 | 3146 |
    +--------------------------------------+-------------------------------+-----------------------------------------------------------------------------------------+--------+-------+------+
  2. I am continuously seeing the below messages in the glustershd logs:
    [2019-01-07 13:14:14.157784] W [MSGID: 101012] [common-utils.c:3186:gf_get_reserved_ports] 36-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info [No such file or directory]
    [2019-01-07 13:14:14.157840] W [MSGID: 101081] [common-utils.c:3226:gf_process_reserved_ports] 36-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
    [2019-01-07 13:14:14.160159] W [MSGID: 101012] [common-utils.c:3186:gf_get_reserved_ports] 36-glusterfs: could not open the file /proc/sys/net/ipv4/ip_local_reserved_ports for getting reserved ports info [No such file or directory]
    [2019-01-07 13:14:14.160213] W [MSGID: 101081] [common-utils.c:3226:gf_process_reserved_ports] 36-glusterfs: Not able to get reserved ports, hence there is a possibility that glusterfs may consume reserved port
    [2019-01-07 13:14:14.183845] I [socket.c:811:__socket_shutdown] 36-pvc-93515db8-1279-11e9-a7a2-5254001ae311-replicate-0-client-1: intentional socket shutdown(7073)
    [2019-01-07 13:14:14.183946] E [MSGID: 101191] [event-epoll.c:759:event_dispatch_epoll_worker] 36-epoll: Failed to dispatch handler
  3. The below logs are continuously logged in the glusterd2 logs:
    time="2019-01-07 13:15:28.484617" level=info msg="client connected" address="10.233.64.8:47178" server=sunrpc source="[server.go:148:sunrpc.(*SunRPC).acceptLoop]" transport=tcp
    time="2019-01-07 13:15:28.485340" level=error msg="registry.SearchByBrickPath() failed for brick" brick=/var/run/glusterd2/bricks/pvc-9480160e-1279-11e9-a7a2-5254001ae311/subvol1/brick2/brick error="SearchByBrickPath: port for brick /var/run/glusterd2/bricks/pvc-9480160e-1279-11e9-a7a2-5254001ae311/subvol1/brick2/brick not found" source="[rpc_prog.go:104:pmap.(*GfPortmap).PortByBrick]"

Observed behavior

glusterfs memory increased from 74MB to 6.8G after 200 PVCs were created. The continuous log messages shown above are also still being logged.

Expected/desired behavior

glusterfs should not consume that much memory.

Details on how to reproduce (minimal and precise)

  1. Create a 3-node GCS setup using valgrind.
  2. Create 200 PVCs and keep monitoring glusterfs resource consumption (a sketch follows this list).
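
A sketch of step 2; the storage class name (glusterfs-csi) and the 1Gi claim size are assumptions, adjust to whatever the GCS deployment actually registers. First a claim template, say pvc-template.yaml:

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: NAME                       # placeholder substituted by the loop below
    spec:
      accessModes: ["ReadWriteMany"]
      storageClassName: glusterfs-csi  # assumed storage class name
      resources:
        requests:
          storage: 1Gi

then a loop that stamps out 200 claims while the monitoring loop above keeps running:

    # create 200 PVCs from the template, one unique name per claim
    for i in $(seq 1 200); do
      sed "s/NAME/test-pvc-$i/" pvc-template.yaml | kubectl create -f -
    done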

Information about the environment:

  • Glusterd2 version used (e.g. v4.1.0 or master): v6.0-dev.99.git0839909
  • Operating system used: CentOS 7.6
  • Glusterd2 compiled from sources, as a package (rpm/deb), or container:
  • Using External ETCD: (yes/no, if yes ETCD version): Yes, 3.3.8
  • If container, which container image:
  • Using kubernetes, openshift, or direct install:
  • If kubernetes/openshift, is gluster running inside kubernetes/openshift or outside: kubernetes
@PrasadDesala
Author

Attaching glusterd2 dump, glusterd2 logs and glusterfs process state dump.

kube3-glusterd2.log.gz
kube2-glusterd2.log.gz
kube1-glusterd2.log.gz
glusterdump.1150.dump.1546865584.gz
statedump_kube-1.txt
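
For reference, the glusterdump.<pid>.dump.<timestamp> file attached above is the statedump a glusterfs process writes on receiving SIGUSR1; a sketch of generating one for the self-heal daemon (dumps land under /var/run/gluster by default):

    # trigger a statedump of the self-heal daemon and list the result
    kill -USR1 "$(cat /var/run/glusterd2/glustershd.pid)"
    ls -l /var/run/gluster/glusterdump.*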

@atinmu
Contributor

atinmu commented Jan 7, 2019

@PrasadDesala I am assuming you meant glustershd is consuming high memory? Also did you enable brick multiplexing in the setup?

@PrasadDesala
Author

> @PrasadDesala I am assuming you meant glustershd is consuming high memory? Also did you enable brick multiplexing in the setup?

I think it is glustershd, but I am not sure why it would be consuming memory, as I am only creating PVCs so no healing should take place. The process name shows up as glusterfs.

Brick-mux is not enabled on the setup.

@aravindavk
Member

> I think it is glustershd, but I am not sure why it would be consuming memory, as I am only creating PVCs so no healing should take place. The process name shows up as glusterfs.

Yes, this is the self-heal process. It can be confirmed by checking cat /proc/<pid>/cmdline
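
Since /proc/<pid>/cmdline is NUL-separated it prints as one long string; a small sketch that makes it readable:

    # print the self-heal daemon's command line, one argument per line
    tr '\0' '\n' < /proc/"$(cat /var/run/glusterd2/glustershd.pid)"/cmdline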

@atinmu
Contributor

atinmu commented Jan 8, 2019

@itisravi @karthik-us ^^ it might be worth checking the same with a GD1-based deployment. This isn't a GD2-specific problem as such.

@amarts
Member

amarts commented Jan 8, 2019

I suspect this is also due to https://review.gluster.org/#/c/glusterfs/+/21990/. Let's run a round of tests tomorrow, as it was merged today.

@atinmu
Contributor

atinmu commented Jan 16, 2019

On the latest master, across multiple iterations, we don't see memory consumption by the glustershd process anywhere near what has been reported, and based on that I'm closing this for now. If we happen to hit this again, please feel free to reopen.

@atinmu closed this as completed Jan 16, 2019
@PrasadDesala
Author

This issue is still seen on the latest nightly build.
glustershd process memory (RES) increased from 8616KB to 6.2g.

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
before: 395 root 20 0 514608 8616 3188 S 0.0 0.0 0:00.05 glusterfs
after: 395 root 20 0 95.3g 6.2g 3324 S 88.2 19.9 14:49.35 glusterfs

[root@gluster-kube1-0 ~]# cat /proc/395/cmdline
/usr/sbin/glusterfs -s gluster-kube1-0.glusterd2.gcs --volfile-server-port 24007 --volfile-id gluster/glustershd -p /var/run/glusterd2/glustershd.pid -l /var/log/glusterd2/glusterfs/glustershd.log -S /var/run/glusterd2/shd-492ab606e75778b6.socket --xlator-option replicate.node-uuid=9842221d-97d1-4041-9d4c-51f6fc6ef191
[root@gluster-kube1-0 ~]# ps -ef | grep -i glustershd
glusterd version: v6.0-dev.109.gitdfb2462

@PrasadDesala changed the title from "glusterfs memory increased from 74MB to 6.8G while creating 200 PVCs" to "glustershd memory keeps increasing while creating PVCs" on Jan 17, 2019
@amarts
Member

amarts commented Jan 17, 2019

Can we disable shd for now in this setup, and re-enable when things settle down?
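
As a hedged sketch of what that could look like: on a GD1-based deployment the self-heal daemon is controlled per volume by the cluster.self-heal-daemon option; the equivalent GD2/glustercli option name is not confirmed here:

    # GD1-style: turn the self-heal daemon off for a volume (re-enable with "on")
    gluster volume set <volname> cluster.self-heal-daemon off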

@atinmu
Contributor

atinmu commented Jan 17, 2019

@PrasadDesala At this moment we don't restart glustershd with every new PVC (which is a bug in GD2), and hence the overall memory consumption of the process remains static irrespective of how many PVCs we create; that is what my test setup reflects too. So I'd definitely like to take a look at a setup where you are able to reproduce this.

@PrasadDesala
Author

@atinmu This issue is closed and I don't have the permissions to reopen it. If you have access, can you please reopen it?

@atinmu reopened this Jan 21, 2019