kamailio/opensips with keepalived, less than a minute in normal use, VIP is "useless" #2522

Closed
chongmin1 opened this issue Dec 30, 2024 · 10 comments

Comments

@chongmin1

Describe the issue
Two SIP accounts registered through the VIP can call each other and hang up normally, but after 30 to 40 seconds they can no longer place calls, and registration fails as well.

To Reproduce
The master and backup are configured with the same kamailio/opensips and keepalived setup. Start kamailio and keepalived on both, then register two softphones (MicroSIP and eyeBeam) against the VIP.
The two accounts can call and hang up normally, but after 30 to 40 seconds they can no longer call, and registration fails as well. If I shut down keepalived on the master, the VIP moves to the backup and they can register and call again, but after another 30 to 40 seconds registration and calls fail again. Starting keepalived on the master again gives the same result. In other words, whenever registration fails, moving the VIP between master and backup once lets me register and make calls normally for a while, unless I kill all the keepalived processes.

Expected behavior
Calls and registration should keep working.

Keepalived version
Keepalived v2.3.1 (05/24,2024) (I have also tried v1.3.5)

Copyright(C) 2001-2024 Alexandre Cassen, acassen@gmail.com

Built with kernel headers for Linux 3.10.0
Running on Linux 3.10.0-1160.el7.x86_64 #1 SMP Mon Oct 19 16:18:59 UTC 2020
Distro: CentOS Linux 7 (Core)

configure options: --prefix=/usr/local/keepalived --sysconfdir=/etc/ PKG_CONFIG_PATH=:/usr/local/lib/pkgconfig

Config options: NFTABLES LVS VRRP VRRP_AUTH VRRP_VMAC OLD_CHKSUM_COMPAT INIT=systemd SYSTEMD_NOTIFY

System options: VSYSLOG LIBNL3 RTA_ENCAP RTA_EXPIRES RTA_PREF FRA_SUPPRESS_PREFIXLEN FRA_TUN_ID RTAX_CC_ALGO RTAX_QUICKACK RTA_VIA IFA_FLAGS IPTABLES NET_LINUX_IF_H_COLLISION LIBIPVS_NETLINK IFLA_LINK_NETNSID GLOB_BRACE GLOB_ALTDIRFUNC INET6_ADDR_GEN_MODE SO_MARK

Distro (please complete the following information):

  • Name Centos
  • Version 7
  • Architecture x86_64

Details of any containerisation or hosted service (e.g. AWS)
null

Configuration file:
keepalived.conf

global_defs {
   script_user root
   enable_script_security
   smtp_connect_timeout 300
   router_id 192.168.xxx.xxx
}

vrrp_script chk_kamailio {
        script "/etc/keepalived/kamailio_check.sh" 
        interval 5 
        ## weight -20 
}

vrrp_instance VI_1 {
    state MASTER
    interface ens192
    virtual_router_id 51
    unicast_src_ip  192.168.xxx.xxx   
    unicast_peer {
        192.168.xxx.xxx      
    }
    dont_track_primary
    priority 100
    advert_int 1
    nopreempt
    authentication {
        auth_type PASS
        auth_pass 1234
    }

    track_script {
        chk_kamailio
    }

    virtual_ipaddress {
        ##192.168.12.104
        192.168.15.104/24 brd 192.168.15.104 dev ens192 label ens192:0

    }
}

kamailio_check.sh

# Timestamp and host name used as the prefix of every log line.
time=$(date +'%b %d %H:%M:%S ')$(hostname -f)
# Count kamailio processes, excluding the grep itself and any bash wrappers.
status=$(ps -ef | grep kamailio | grep -v grep | grep -v bash | wc -l)

# Fewer than 19 processes is treated as "kamailio is not running".
if [ "$status" -lt 19 ]; then
    echo "$time Warning kama is not running, trying to start it" >> /etc/keepalived/log/keepalived.log
    kamctl start
    status=$(ps -ef | grep kamailio | grep -v grep | grep -v bash | wc -l)
    time=$(date +'%b %d %H:%M:%S ')$(hostname -f)
    if [ "$status" -lt 19 ]; then
        # The restart did not help: stop keepalived so the VIP is released.
        echo "$time Error kamailio restart failed, end keepalived" >> /etc/keepalived/log/keepalived.log
        systemctl stop keepalived
    else
        echo "$time kamailio start success" >> /etc/keepalived/log/keepalived.log
    fi
fi

kamailio.conf/opensips.conf

listen=udp:192.168.xxx.xxx:8060
listen=udp:192.168.15.104:5060

ip addr
inet 192.168.xxx.xxx/24 brd 192.168.xxx.xxx scope global noprefixroute ens192
valid_lft forever preferred_lft forever
inet 192.168.15.104/24 brd 192.168.15.104 scope global secondary ens192:0
valid_lft forever preferred_lft forever

Notify and track scripts
null

System Log entries

Dec 27 17:20:04 localhost Keepalived[4744]: Starting Keepalived v2.3.1 (05/24,2024)
Dec 27 17:20:04 localhost Keepalived[4744]: Running on Linux 3.10.0-1160.119.1.el7.x86_64 #1 SMP Tue Jun 4 14:43:51 UTC 2024 (built for Linux 3.10.0)
Dec 27 17:20:04 localhost Keepalived[4744]: Command line: '/usr/local/keepalived/sbin/keepalived' '-D' '-S' '2'
Dec 27 17:20:04 localhost Keepalived[4744]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 27 17:20:04 localhost Keepalived[4744]: Configuration file /etc/keepalived/keepalived.conf
Dec 27 17:20:04 localhost Keepalived[4745]: NOTICE: setting config option max_auto_priority should result in better keepalived performance
Dec 27 17:20:04 localhost Keepalived[4745]: Starting VRRP child process, pid=4746
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: Registering Kernel netlink reflector
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: Registering Kernel netlink command channel
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: (VI_1) Warning - nopreempt will not work with initial state MASTER - clearing
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: Assigned address 192.168.xxx.xxx for interface ens192
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: Registering gratuitous ARP shared channel
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: (VI_1) removing VIPs.
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: VRRP sockpool: [ifindex(  2), family(IPv4), proto(112), fd(12,13) , unicast, address(192.168.xxx.xxx)]
Dec 27 17:20:04 localhost Keepalived[4745]: Startup complete
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: VRRP_Script(chk_kamailio) succeeded
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: (VI_1) Entering BACKUP STATE
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: VI_1: sending gratuitous ARP for 192.168.xxx.xxx
Dec 27 17:20:04 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.xxx.xxx
Dec 27 17:20:06 localhost Keepalived_vrrp[4746]: (VI_1) received lower priority (90) advert from 192.168.xxx.xxx - discarding
Dec 27 17:20:07 localhost Keepalived_vrrp[4746]: (VI_1) received lower priority (90) advert from 192.168.xxx.xxx - discarding
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) Receive advertisement timeout
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) Entering MASTER STATE
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) setting VIPs.
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) Sending/queueing gratuitous ARPs on ens192 for 192.168.15.104
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: (VI_1) Sending/queueing gratuitous ARPs on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:13 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:36 localhost Keepalived[4745]: Stopping
Dec 27 17:20:36 localhost Keepalived_vrrp[4746]: (VI_1) sent 0 priority
Dec 27 17:20:36 localhost Keepalived_vrrp[4746]: (VI_1) removing VIPs.
Dec 27 17:20:37 localhost Keepalived_vrrp[4746]: Stopped - used (self/children) 0.002662/0.079051 user time, 0.005082/0.267156 system time
Dec 27 17:20:37 localhost Keepalived[4745]: CPU usage (self/children) user: 0.000904/0.081713 system: 0.000000/0.273118
Dec 27 17:20:37 localhost Keepalived[4745]: Stopped Keepalived v2.3.1 (05/24,2024)

Did keepalived coredump?
No

Additional context
no

@pqarmitage
Collaborator

I suspect the problem is:

  virtual_ipaddress {
        192.168.15.104/24 brd 192.168.15.104 dev ens192 label ens192:0
    }

It doesn't make sense to configure the broadcast address to be the same as the address configured on the interface.
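
For reference, the broadcast address of an IPv4 subnet is the address with all host bits set, so for 192.168.15.104/24 it would be 192.168.15.255. A minimal sketch of that calculation in bash follows; `broadcast_of` is a hypothetical helper name introduced here, not part of keepalived or iproute2.

```shell
#!/bin/bash
# broadcast_of ADDR/PREFIX: print the IPv4 broadcast address of the subnet.
# Hypothetical helper, for illustration only.
broadcast_of() {
    local addr="${1%/*}" prefix="${1#*/}"
    local a b c d
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<< "$addr"
    local ip=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Host mask: the low (32 - prefix) bits set to 1.
    local hostmask=$(( (1 << (32 - prefix)) - 1 ))
    local bc=$(( ip | hostmask ))
    printf '%d.%d.%d.%d\n' \
        $(( (bc >> 24) & 255 )) $(( (bc >> 16) & 255 )) \
        $(( (bc >> 8) & 255 )) $(( bc & 255 ))
}

broadcast_of 192.168.15.104/24   # prints 192.168.15.255
```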

@chongmin1
Author

@pqarmitage I previously used 192.168.15.104/24 brd 192.168.15.255 dev ens192 label ens192:0, and I have just tried it again; the problem is still the same.

@pqarmitage
Collaborator

Your log shows:
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) Entering MASTER STATE
and

Dec 27 17:20:36 localhost Keepalived[4745]: Stopping
Dec 27 17:20:36 localhost Keepalived_vrrp[4746]: (VI_1) sent 0 priority
Dec 27 17:20:36 localhost Keepalived_vrrp[4746]: (VI_1) removing VIPs.
Dec 27 17:20:37 localhost Keepalived_vrrp[4746]: Stopped - used (self/children) 0.002662/0.079051 user time, 0.005082/0.267156 system time

Stopping keepalived causes the 192.168.15.104/24 VIP to be removed, which would stop your SIP sessions from working.

Why is keepalived being stopped? Is it because the kamailio_check.sh script is executing systemctl stop keepalived? keepalived doesn't just stop itself; it has to be instructed to do so.
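
To line up state changes and stops against the time the SIP sessions fail, it can help to reduce the log to just those events. The following is a rough sketch: `vrrp_events` is a hypothetical helper name, and the patterns are simply the message strings that appear in the log posted above.

```shell
#!/bin/bash
# vrrp_events: read syslog lines on stdin and keep only VRRP state changes,
# VIP additions/removals, and keepalived start/stop messages.
# Hypothetical helper; the patterns come from the log in this issue.
vrrp_events() {
    grep -E 'Entering (MASTER|BACKUP|FAULT) STATE|(setting|removing) VIPs|Keepalived\[[0-9]+\]: (Starting|Stopping)'
}

# Demonstration on a few lines taken from the log above.
vrrp_events <<'EOF'
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: (VI_1) Entering MASTER STATE
Dec 27 17:20:08 localhost Keepalived_vrrp[4746]: Sending gratuitous ARP on ens192 for 192.168.15.104
Dec 27 17:20:36 localhost Keepalived[4745]: Stopping
Dec 27 17:20:36 localhost Keepalived_vrrp[4746]: (VI_1) removing VIPs.
EOF
```

On a real system the function would be fed from wherever syslog writes keepalived's messages, which depends on the local syslog configuration.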

@chongmin1
Author

As I said before, I start the master and the backup. At first the VIP on the master can be used, but it stops working after a few tens of seconds. I then stop keepalived on the master myself, the VIP moves to the backup, and it is usable again for a few tens of seconds.

@chongmin1
Author

keepalived and kamailio do not stop on their own, and kamailio_check.sh runs normally. If I add a print statement to the script, it is written to the log on every run.

# Timestamp and host name used as the prefix of every log line.
time=$(date +'%b %d %H:%M:%S ')$(hostname -f)
# Count kamailio processes, excluding the grep itself and any bash wrappers.
status=$(ps -ef | grep kamailio | grep -v grep | grep -v bash | wc -l)
if [ "$status" -lt 19 ]; then
    echo "$time Warning kama is not running, trying to start it" >> /etc/keepalived/log/keepalived.log
    kamctl start
    status=$(ps -ef | grep kamailio | grep -v grep | grep -v bash | wc -l)
    time=$(date +'%b %d %H:%M:%S ')$(hostname -f)
    if [ "$status" -lt 19 ]; then
        echo "$time Error kamailio restart failed, end keepalived" >> /etc/keepalived/log/keepalived.log
        systemctl stop keepalived
        kamctl stop
    else
        echo "$time kamailio start success" >> /etc/keepalived/log/keepalived.log
    fi
else
    # Added for debugging: log every run in which kamailio is healthy.
    echo "$time kamailio is running" >> /etc/keepalived/log/keepalived.log
fi

log

Dec 31 10:31:26 localhost kamailio is running
Dec 31 10:31:31 localhost kamailio is running
Dec 31 10:31:36 localhost kamailio is running
Dec 31 10:31:41 localhost kamailio is running
Dec 31 10:31:46 localhost kamailio is running
Dec 31 10:31:51 localhost kamailio is running
Dec 31 10:31:56 localhost kamailio is running

@pqarmitage
Collaborator

All that keepalived does in respect of VIPs is add and delete the addresses and, for performance reasons, send gratuitous ARP messages. keepalived does not handle any IP traffic in relation to the VIP.

Given the above, there is very little that keepalived is doing in respect of your scenario. So long as the VIP is configured on ens192 on the master system, and the backup system has not added the VIP, then whatever is happening is outside the control of keepalived.

I think you need to diagnose for yourself why the SIP session stops working on your systems. For example, you could manually add 192.168.15.104/24 brd + on one system with keepalived not running, and see what happens with SIP sessions. If that still has the problem, then you know the problem is not related to keepalived. If you don't get the problem when manually adding the address, then the question is what is different when keepalived adds the VIP.
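
The manual test could look roughly like the sketch below. The `run`, `vip_add` and `vip_del` helpers and the `DRY_RUN` switch are hypothetical names introduced here; the address and interface are the ones from this issue. By default the script only prints the `ip` commands. To apply them, set `DRY_RUN=0` and run it as root with keepalived stopped on both systems.

```shell
#!/bin/bash
VIP_CIDR="192.168.15.104/24"   # VIP from this issue
IFACE="ens192"                 # interface from this issue
DRY_RUN="${DRY_RUN:-1}"        # assumption: print commands unless DRY_RUN=0

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# "brd +" asks ip(8) to derive the broadcast address from the prefix.
vip_add() { run ip addr add "$VIP_CIDR" brd + dev "$IFACE"; }
vip_del() { run ip addr del "$VIP_CIDR" dev "$IFACE"; }

vip_add
# ... register the softphones, wait about a minute, test a call ...
vip_del
```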

You probably need to try tracing what is happening with packets, using wireshark or tcpdump. When the SIP session stops working, where are the packets being dropped, or where are they being forwarded to? Perhaps there is an issue with some firewall configuration.
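
As a starting point, a capture limited to ARP for the VIP and SIP traffic to or from it keeps the output readable. This is only a sketch: `capture_cmd` is a hypothetical helper that prints a tcpdump command line, and UDP port 5060 is taken from the listen line in the configuration above. The `-e` option prints link-level headers, which shows which MAC address each packet came from.

```shell
#!/bin/bash
# capture_cmd IFACE VIP PORT: print a tcpdump command line that captures
# ARP for the VIP plus UDP traffic to or from the VIP on the given port.
# Hypothetical helper; run the printed command as root.
capture_cmd() {
    printf "tcpdump -eni %s 'arp host %s or (host %s and udp port %s)'\n" \
        "$1" "$2" "$2" "$3"
}

capture_cmd ens192 192.168.15.104 5060
```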

Unfortunately, with the information you have provided, there is nothing more that we can do to help identify the cause of your problem.

@chongmin1
Author

OK, thank you for answering my question.

@pqarmitage
Collaborator

@chongmin1 It would be helpful for future reference if you could update this issue with the cause of and solution to the problem, once you have found it.

@chongmin1
Author

Hello, I have solved this problem. In my intranet environment, 192.168.15.104 was already in use as the IP address of an existing host, which is what caused this situation. Only an IP that the local machine cannot ping, meaning no other host is using it, should be used as a VIP.
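
For anyone who hits the same symptom, a conflict like this can usually be detected before choosing a VIP by sending ARP probes in duplicate address detection mode, which can also catch hosts that do not answer ping. A minimal sketch, assuming the iputils version of arping (its -D option exits with status 0 when no reply is received); `vip_status` and the `PROBE` override are hypothetical names introduced here.

```shell
#!/bin/bash
# vip_status IFACE ADDR: print "free" if duplicate address detection sees
# no reply for ADDR, otherwise "not-free" (a reply was seen, or the probe
# itself could not run). Run with keepalived stopped on both nodes so that
# neither of them holds the VIP.
# PROBE is a hypothetical override for testing; the default is arping.
vip_status() {
    local probe="${PROBE:-arping}"
    # -D: duplicate address detection, -c 2: two probes, -w 3: 3 s deadline
    if "$probe" -D -c 2 -w 3 -I "$1" "$2" >/dev/null 2>&1; then
        echo free
    else
        echo not-free
    fi
}

# Usage (as root): vip_status ens192 192.168.15.104
```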

@pqarmitage
Collaborator

@chongmin1 It's good that you have resolved the problem, and thanks for the update.
