Beta branch - feedback on use #686

Closed
pqarmitage opened this issue Oct 29, 2017 · 12 comments

@pqarmitage
Collaborator

Please add comments to this issue if you have successfully used the new beta branch of keepalived. In particular it would be helpful to describe the way that you are using keepalived, what platform you are running on, any benefits you have found with the new code and any suggested enhancements.

If you have found any issues with the beta code, please raise a separate issue report, add the beta label and include the following in the issue report:

  1. Indicate that the issue was found with the beta branch code (add the beta label to the issue)
  2. The output of keepalived -v
  3. A copy of your configuration, anonymised if needed
  4. What Linux distribution you are running on
  5. Whether you are running keepalived in a VM or container, with details of those as appropriate.
  6. A detailed description of the problem.
@pqarmitage pqarmitage self-assigned this Oct 29, 2017
@pqarmitage pqarmitage changed the title from "Beta branch - reports of use" to "Beta branch - feedback on use" Oct 29, 2017
@fenice2

fenice2 commented Nov 1, 2017

I've just built a copy of this beta version with "rpmbuild" and am currently running it on my CentOS 7 VM servers, without problems so far.

I'm also testing building packages under mock, and keepalived failed to build in that environment. I've listed the tail end of the log file below. Is there anything else that's needed from me, and is this a supported build environment?

Keepalived configuration

Keepalived version : 1.3.9
Compiler : gcc
Preprocessor flags : -I/lib/modules/3.10.0-693.5.2.el7/build/include -I/usr/include/libnl3
Compiler flags : -Wall -Wunused -Wstrict-prototypes -Wextra -g -O2
Linker flags :
Extra Lib : -lcrypto -lssl -lnl-genl-3 -lnl-3 -lip4tc -lip6tc -lxtables -ldl
Use IPVS Framework : Yes
IPVS use libnl : Yes
IPVS syncd attributes : No
IPVS 64 bit stats : No
fwmark socket support : Yes
Use VRRP Framework : Yes
Use VRRP VMAC : Yes
Use VRRP authentication : Yes
With ip rules/routes : Yes
SNMP vrrp support : No
SNMP checker support : No
SNMP RFCv2 support : No
SNMP RFCv3 support : No
DBUS support : No
SHA1 support : No
Use Debug flags : No
Use Json output : No
Stacktrace support : No
Memory alloc check : No
libnl version : 3
Use IPv4 devconf : No
Use libiptc : Yes
Use libipset : Yes
init type : undetected
Build genhash : Yes
Build documentation : No
Making all in lib

    + /usr/bin/make STRIP=/bin/true
    make[1]: Entering directory `/builddir/build/BUILD/keepalived-1.3.9/lib'
    /usr/bin/make all-am
    make[2]: Entering directory `/builddir/build/BUILD/keepalived-1.3.9/lib'
    CC memory.o
    CC utils.o
    CC notify.o
    CC timer.o
    CC scheduler.o
    CC vector.o
    CC list.o
    CC html.o
    CC parser.o
    parser.c: In function 'read_conf_file':
    parser.c:444:10: warning: ignoring return value of 'fchdir', declared with attribute warn_unused_result [-Wunused-result]
    fchdir(curdir_fd);
    ^
    CC signals.o
    CC logger.o
    CC rttables.o
    AR liblib.a
    make[2]: Leaving directory `/builddir/build/BUILD/keepalived-1.3.9/lib'
    make[1]: Leaving directory `/builddir/build/BUILD/keepalived-1.3.9/lib'
    Making all in keepalived
    make[1]: Entering directory `/builddir/build/BUILD/keepalived-1.3.9/keepalived'
    Making all in core
    make[2]: Entering directory `/builddir/build/BUILD/keepalived-1.3.9/keepalived/core'
    CC main.o
    CC daemon.o
    CC pidfile.o
    CC layer4.o
    CC smtp.o
    CC global_data.o
    CC global_parser.o
    CC process.o
    CC keepalived_netlink.o
    keepalived_netlink.c: In function 'netlink_request':
    keepalived_netlink.c:843:49: error: 'RTEXT_FILTER_SKIP_STATS' undeclared (first use in this function)
    addattr32(&req.nlh, sizeof req, IFLA_EXT_MASK, RTEXT_FILTER_SKIP_STATS);
    ^
    keepalived_netlink.c:843:49: note: each undeclared identifier is reported only once for each function it appears in
    make[2]: Leaving directory `/builddir/build/BUILD/keepalived-1.3.9/keepalived/core'
    make[1]: Leaving directory `/builddir/build/BUILD/keepalived-1.3.9/keepalived'
    RPM build errors:
    make[2]: *** [keepalived_netlink.o] Error 1
    make[1]: *** [all-recursive] Error 1
    make: *** [all-recursive] Error 1
    error: Bad exit status from /var/tmp/rpm-tmp.2KIkFi (%build)
    Bad exit status from /var/tmp/rpm-tmp.2KIkFi (%build)
    Child return code was: 1
    EXCEPTION: [Error()]
    Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 96, in trace
    result = func(*args, **kw)
    File "/usr/lib/python2.7/site-packages/mockbuild/util.py", line 598, in do
    raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode)
    Error: Command failed:

/usr/bin/systemd-nspawn -q -M 87e1080a067146ae8e0cf7bc410e7d5a -D /var/lib/mock/epel-7-x86_64/root --private-network --setenv=LANG=en_GB.UTF-8 --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=HOSTNAME=mock --setenv=PROMPT_COMMAND=printf "\033]0;\007" --setenv=HOME=/builddir --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=PS1= \s-\v$ -u mockbuild bash --login -c /usr/bin/rpmbuild -bb --target x86_64 --nodeps /builddir/build/SPECS/keepalived.spec

@pqarmitage
Collaborator Author

I've just pushed an update to the beta branch that should fix your compilation issues. Unfortunately I hadn't included a check that the kernel headers supported RTEXT_FILTER_SKIP_STATS.

Many thanks for your feedback.
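
For readers hitting the same error on older kernel headers, the kind of check described above can be sketched as a compile-time guard. This is only an illustration and not the actual commit on the beta branch: the helper name `ext_mask_filter` is hypothetical, and it assumes `RTEXT_FILTER_SKIP_STATS` is provided as a preprocessor macro by `<linux/rtnetlink.h>` when the headers support it.

```c
#include <stdint.h>
#include <linux/rtnetlink.h>

/*
 * Hypothetical helper: returns the value to pass with IFLA_EXT_MASK,
 * or 0 when the installed kernel headers do not define
 * RTEXT_FILTER_SKIP_STATS (in which case the attribute can be omitted).
 */
static uint32_t ext_mask_filter(void)
{
#ifdef RTEXT_FILTER_SKIP_STATS
	return RTEXT_FILTER_SKIP_STATS;
#else
	return 0;
#endif
}
```

A caller would add the `IFLA_EXT_MASK` attribute only when the returned value is non-zero, so the same source builds against both old and new headers. A configure-time test is another way to get the same effect.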

@fenice2

fenice2 commented Nov 1, 2017

Thanks for the quick update. A build now seems to fail on a missing systemd service file:

RPM build errors:

    + exit 0
    File not found: /builddir/build/BUILDROOT/keepalived-1.3.9-1.x86_64/usr/lib/systemd/system/keepalived.service
    Child return code was: 1
    EXCEPTION: [Error()]
    Traceback (most recent call last):
    File "/usr/lib/python2.7/site-packages/mockbuild/trace_decorator.py", line 96, in trace
    result = func(*args, **kw)
    File "/usr/lib/python2.7/site-packages/mockbuild/util.py", line 598, in do
    raise exception.Error("Command failed: \n # %s\n%s" % (command, output), child.returncode)
    Error: Command failed:
    /usr/bin/systemd-nspawn -q -M 9d640fb2a6d74fe9a7a89d14fe5f41c9 -D /var/lib/mock/epel-7-x86_64/root --private-network --setenv=LANG=en_GB.UTF-8 --setenv=TERM=vt100 --setenv=SHELL=/bin/bash --setenv=HOSTNAME=mock --setenv=PROMPT_COMMAND=printf "\033]0;\007" --setenv=HOME=/builddir --setenv=PATH=/usr/bin:/bin:/usr/sbin:/sbin --setenv=PS1= \s-\v$ -u mockbuild bash --login -c /usr/bin/rpmbuild -bb --target x86_64 --nodeps /builddir/build/SPECS/keepalived.spec

@pqarmitage
Collaborator Author

Unfortunately I don't have experience of mock builds. Doing an rpm build outside a mock environment completes successfully for me.

The only thing I can think of is that maybe the lines

if INIT_SYSTEMD
systemdsystemunit_DATA  = keepalived.service
endif

in keepalived/Makefile.am have a problem.

Could you attach the full log of the mock build so we can try to see where the keepalived.service file is being installed?

@fenice2

fenice2 commented Nov 1, 2017

Same for me, the normal rpmbuild works fine and only the mock build is failing. I've attached the build log:
build.log

This is a CentOS 7 VM running on ESXi; it has 2 GB of RAM and a non-standard (updated) kernel:
Linux centosdev 4.13.9-1.el7.elrepo.x86_64 #1 SMP Sun Oct 22 10:02:34 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

@pqarmitage
Collaborator Author

@fenice2 I've found a problem with mock builds, that I've resolved at least on my system. I can now do a mock build on Fedora 26 with a Fedora 26 target. If I try doing a mock build on my Fedora 26 system with an EPEL7 target it fails, but that looks more like a mock problem.

I've pushed the fixes upstream so if you pull the beta branch I hope mock builds will be fixed for you.

@fenice2

fenice2 commented Nov 2, 2017

Fantastic, that worked a treat - it's all built, installs and works. :)

As always, many thanks for your quick response and all your work on this project.

Regards

Bill

@lkarsten

lkarsten commented May 7, 2018

Hi.

A bit of feedback: We've been running "Keepalived v2.0.0 (04/11,2018)" in production on ubuntu bionic for about three weeks now. No IPVS, just a single VRRP group with a single IPv4 VIP. No IPv6.

No issues seen so far.

Thanks for writing keepalived!

@rwgroenenberg

Running in a configuration where some track_scripts are used to control the priority of a VRRP instance, I notice that when the priority of the Master falls below that of the Backup, there is no (immediate) re-election.

Most of the time when I try this, the Backup changes state only after ignoring three received adverts with lower priority:

May 18 11:20:52 vm1 Keepalived_vrrp[14548]: vrrp_backup
May 18 11:20:52 vm1 Keepalived_vrrp[14548]: (VI_1) received lower priority (100) advert from 192.168.123.12 - discarding
May 18 11:20:53 vm1 Keepalived_vrrp[14548]: vrrp_backup
May 18 11:20:53 vm1 Keepalived_vrrp[14548]: (VI_1) received lower priority (100) advert from 192.168.123.12 - discarding
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: vrrp_backup
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: (VI_1) received lower priority (100) advert from 192.168.123.12 - discarding
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: vrrp_goto_master
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: vrrp_state_goto_master
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: (VI_1) Entering MASTER STATE
May 18 11:20:54 vm1 Keepalived_vrrp[14548]: (VI_1) Sending SNMP notification

But I've also seen a few cases where the Backup keeps ignoring the lower priority adverts forever...

In this respect the beta code differs from master (i.e. 1.4.4), where a received lower priority advert does trigger a re-election:
[master] vrrp.c:1632

	} else {
		log_message(LOG_INFO, "VRRP_Instance(%s) forcing a new MASTER election" , vrrp->iname);
		vrrp->wantstate = VRRP_STATE_GOTO_MASTER;
		vrrp_send_adv(vrrp, vrrp->effective_priority);
#ifdef _WITH_SNMP_RFCV3_
		vrrp->stats->master_reason = VRRPV3_MASTER_REASON_PREEMPTED;
#endif

[beta] vrrp.c:1736

	} else {
		/* !nopreempt and lower priority advert and any preempt delay timer has expired */
		log_message(LOG_INFO, "(%s) received lower priority (%d) advert from %s - discarding", vrrp->iname, hd->priority, inet_sockaddrtos(&vrrp->pkt_saddr));

		ignore_advert = true;

#ifdef _WITH_SNMP_RFCV3_
		vrrp->stats->next_master_reason = VRRPV3_MASTER_REASON_PREEMPTED;
#endif

		/* We still want to record the master's address for SNMP purposes */
		vrrp->master_saddr = vrrp->pkt_saddr;
	}

Is it intentional that the change from Backup to Master is no longer forced immediately upon receiving a lower priority advert?

(Keepalived v2.0.0 (05/08,2018), git commit 1.4.3-19-g9b8680a+ on CentOS6.9)

@pqarmitage
Collaborator Author

@rwgroenenberg The RFC for VRRPv3 (RFC 5798) states:

         (425) + If the Priority in the ADVERTISEMENT is zero, then:
            (430) * Set the Master_Down_Timer to Skew_Time
         (440) + else // priority non-zero
            (445) * If Preempt_Mode is False, or if the Priority in the
            ADVERTISEMENT is greater than or equal to the local
            Priority, then:
               (450) @ Set Master_Adver_Interval to Adver Interval
               contained in the ADVERTISEMENT
               (455) @ Recompute the Master_Down_Interval
               (460) @ Reset the Master_Down_Timer to
               Master_Down_Interval
            (465) * else // preempt was true or priority was less
               (470) @ Discard the ADVERTISEMENT
            (475) *endif // preempt test
         (480) +endif // was priority zero?

Note: The comment at (465) should state // preempt was true **and** priority was less

(465) and (470) describe what to do when a lower priority advert is received, i.e. discard it, and this is how keepalived now behaves (the old code sent an advert in response, which is more than simply discarding the lower priority advert).
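
The Backup-state steps quoted above can be sketched as a small decision function. This is an illustration of the RFC pseudocode only, not keepalived's code: the enum and the function name `backup_handle_advert` are hypothetical, and the last branch follows the corrected reading of (465), i.e. preempt is true and the advert's priority is lower.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes, one per outcome in the RFC pseudocode. */
enum advert_action {
	ADVERT_SET_SKEW_TIMER,     /* (430) Master_Down_Timer = Skew_Time */
	ADVERT_RESET_MASTER_DOWN,  /* (450)-(460) accept and reset the timer */
	ADVERT_DISCARD             /* (470) ignore; the timer keeps running */
};

/* Decide what a Backup does with a received advert, per steps (425)-(480). */
static enum advert_action
backup_handle_advert(uint8_t advert_prio, uint8_t local_prio, bool preempt_mode)
{
	if (advert_prio == 0)
		return ADVERT_SET_SKEW_TIMER;
	if (!preempt_mode || advert_prio >= local_prio)
		return ADVERT_RESET_MASTER_DOWN;
	/* preempt is true and the advert's priority is lower than ours */
	return ADVERT_DISCARD;
}
```

For example, with preempt enabled, a local priority of 150 and a received priority of 100, the function returns `ADVERT_DISCARD`, so the Master_Down_Timer is left to expire rather than being reset.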

So far as I remember, one of the problems with the old code was that if it received a lower priority advert, it reset the received advert timer and consequently it never transitioned to master. I decided when fixing this to do it in accordance with the RFC.
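
As a rough guide to when a Backup that is discarding adverts should take over, RFC 5798 defines Skew_Time as ((256 - Priority) * Master_Adver_Interval) / 256 and Master_Down_Interval as (3 * Master_Adver_Interval) + Skew_Time. A minimal sketch of that arithmetic follows; the function names are hypothetical, the units are centiseconds as in the RFC, and this is not keepalived's own timer code.

```c
#include <stdint.h>

/* Skew_Time in centiseconds for a given local priority (RFC 5798 formula). */
static uint32_t skew_time_cs(uint8_t priority, uint32_t master_adver_interval_cs)
{
	return ((256u - priority) * master_adver_interval_cs) / 256u;
}

/* Master_Down_Interval in centiseconds: three advert intervals plus the skew. */
static uint32_t master_down_interval_cs(uint8_t priority,
					uint32_t master_adver_interval_cs)
{
	return 3u * master_adver_interval_cs
	       + skew_time_cs(priority, master_adver_interval_cs);
}
```

With a 1 second advert interval (100 centiseconds) the result is a little over three intervals, which would be consistent with a Backup discarding about three lower priority adverts before entering MASTER state, provided the timer is not reset in between.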

@rwgroenenberg

I see, well then I'll keep an eye out for the situation of remaining in Backup forever despite receiving lower prio adverts. I'm quite sure that occurred on a clean beta version, but I'm currently trying to get the concept of an advert_group in (like you suggested as alternative to my multi i/f on a single VRRP instance).

@pqarmitage
Collaborator Author

Beta branch now merged into master. Any issues with what was the beta branch should be tested against the keepalived v2.0 code, and if they still exist please report a new issue.


4 participants