PID file removed after reload #6082

Closed
tgrymatt opened this issue Feb 7, 2018 · 19 comments · Fixed by Icinga/deb-icinga2#3 or #6163
Labels
area/setup Installation, systemd, sample files bug Something isn't working
Milestone

Comments

@tgrymatt

tgrymatt commented Feb 7, 2018

After reloading icinga2 via /etc/init.d/icinga2 reload, there is no icinga2 process anymore.

OS: Debian 9.3
Icinga Version:

icinga2                                   2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
icinga2-bin                               2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
icinga2-common                            2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 all
icinga2-doc                               2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 all
icinga2-ido-mysql                         2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64
libicinga2                                2.8.1+406.g407a2d052.2018.02.07+1.stretch-0 amd64

Debug Logfile output:

[2018-02-07 09:11:06 +0100] information/Application: Got reload command: Starting new instance.
[2018-02-07 09:11:06 +0100] notice/Process: Running command '/usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2' '--no-stack-rlimit' 'daemon' '-e' '/var/log/icinga2/error.log' '--reload-internal' '27264': PID 27446
[2018-02-07 09:11:07 +0100] debug/IdoMysqlConnection: Query: COMMIT
[2018-02-07 09:11:07 +0100] debug/IdoMysqlConnection: Query: BEGIN
[2018-02-07 09:11:07 +0100] information/Application: Reload requested, letting new process take over.

After that there is no further output, and no icinga2 process is running anymore.
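
One way to confirm this state is to check whether the PID file still exists and whether the PID it contains belongs to a live process. A minimal sketch, assuming the PID path /run/icinga2/icinga2.pid (adjust it if your build uses a different run directory); `check_pidfile` is a made-up helper name, not part of Icinga 2:

```shell
#!/bin/sh
# check_pidfile is a hypothetical helper, not part of Icinga 2.
# Prints a one-line diagnosis; returns 0 only if the PID file names a live process.
check_pidfile() {
    pidfile="${1:-/run/icinga2/icinga2.pid}"   # assumed default path

    if [ ! -f "$pidfile" ]; then
        echo "missing: $pidfile"
        return 1
    fi

    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists
    # and whether the caller is allowed to signal it.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running: pid $pid"
        return 0
    fi

    echo "stale: $pidfile names pid $pid, which is not running"
    return 1
}
```

Run it as root right before and right after the reload, for example `check_pidfile /run/icinga2/icinga2.pid`. As a non-root user, `kill -0` can fail for a process owned by another user, so a "stale" result is only meaningful with sufficient privileges.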

@dnsmichi
Contributor

dnsmichi commented Feb 7, 2018

@Crunsher since you've changed parts of the reload handling, any ideas?

@Crunsher
Contributor

Crunsher commented Feb 7, 2018

Reload requested, letting new process take over.

This definitely came with my change. Do you run SELinux? It could block the SIGUSR2 signal, possibly.

@Crunsher Crunsher added the needs feedback We'll only proceed once we hear from you again label Feb 9, 2018
@tgrymatt
Author

@Crunsher SELinux is currently disabled. The issue is still present in the current version, 2.8.1+414.gbb96b7742.2018.02.10+1.stretch-0.

@gunnarbeutner gunnarbeutner added the bug Something isn't working label Feb 13, 2018
@Crunsher
Contributor

I was unable to reproduce this on CentOS 6 (I don't have a Debian with custom sysvinit lying about):

[root@localhost ~]# /etc/init.d/icinga2 start
Checking configuration: Done
Starting Icinga 2: Done
[root@localhost ~]# /etc/init.d/icinga2 status
Icinga 2 status: Running
[root@localhost ~]# /etc/init.d/icinga2 reload
Validating config files: Done
Reloading Icinga 2: Done
[root@localhost ~]# /etc/init.d/icinga2 status
Icinga 2 status: Running
[root@localhost ~]# ps ax | grep icinga2
  2631 ?        Ssl    0:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -c /etc/icinga2/icinga2.conf -d -e /var/log/icinga2/error.log --reload-internal 2349
  2655 ?        S      0:00 /usr/lib64/icinga2/sbin/icinga2 --no-stack-rlimit daemon -c /etc/icinga2/icinga2.conf -d -e /var/log/icinga2/error.log --reload-internal 2349
  2768 pts/0    S+     0:00 grep icinga2

Could you please run the reload with strace? strace /etc/init.d/icinga2 reload
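
A plain strace only traces the init script's own shell; the child processes it starts are not followed. A sketch of a fuller capture, assuming strace is installed and /tmp is writable; both helper names and the output path are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers for illustration; neither is part of Icinga 2.

# Capture a trace of the reload.
#   -f   follow child processes created by the script
#   -tt  prefix each line with a wall-clock timestamp
#   -o   write the trace to a file instead of the terminal
trace_reload() {
    strace -f -tt -o "${1:-/tmp/icinga2-reload.trace}" /etc/init.d/icinga2 reload
}

# Print only the lines that show a program being executed
# or a signal being sent or received.
summarize_trace() {
    grep -E 'execve\(|kill\(|--- SIG' "$1"
}
```

Calling `trace_reload` and then `summarize_trace /tmp/icinga2-reload.trace` keeps the script's own terminal output separate from the trace and makes it easier to see which programs the script actually invoked.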

@tgrymatt
Author

tgrymatt commented Feb 15, 2018

strace /etc/init.d/icinga2 reload
execve("/etc/init.d/icinga2", ["/etc/init.d/icinga2", "reload"], [/* 24 vars */]) = 0
brk(NULL)                               = 0x563d33984000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fab5c65a000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=65606, ...}) = 0
mmap(NULL, 65606, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fab5c649000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\3\2\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1689360, ...}) = 0
mmap(NULL, 3795360, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fab5c09b000
mprotect(0x7fab5c230000, 2097152, PROT_NONE) = 0
mmap(0x7fab5c430000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x195000) = 0x7fab5c430000
mmap(0x7fab5c436000, 14752, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fab5c436000
close(3)                                = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fab5c647000
arch_prctl(ARCH_SET_FS, 0x7fab5c647700) = 0
mprotect(0x7fab5c430000, 16384, PROT_READ) = 0
mprotect(0x563d327f9000, 8192, PROT_READ) = 0
mprotect(0x7fab5c65d000, 4096, PROT_READ) = 0
munmap(0x7fab5c649000, 65606)           = 0
getpid()                                = 27306
rt_sigaction(SIGCHLD, {sa_handler=0x563d325efef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
geteuid()                               = 0
brk(NULL)                               = 0x563d33984000
brk(0x563d339a5000)                     = 0x563d339a5000
getppid()                               = 27304
stat("/root", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
stat(".", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
open("/etc/init.d/icinga2", O_RDONLY)   = 3
fcntl(3, F_DUPFD, 10)                   = 10
close(3)                                = 0
fcntl(10, F_SETFD, FD_CLOEXEC)          = 0
rt_sigaction(SIGINT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGINT, {sa_handler=0x563d325efef0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
rt_sigaction(SIGQUIT, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGQUIT, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
rt_sigaction(SIGTERM, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGTERM, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
read(10, "#! /bin/sh\n### BEGIN INIT INFO\n#"..., 8192) = 6513
rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER, sa_restorer=0x7fab5c0ce030}, NULL, 8) = 0
geteuid()                               = 0
stat("/usr/sbin/icinga2", {st_mode=S_IFREG|0755, st_size=840, ...}) = 0
faccessat(AT_FDCWD, "/usr/sbin/icinga2", X_OK) = 0
faccessat(AT_FDCWD, "/etc/default/icinga2", R_OK) = 0
open("/etc/default/icinga2", O_RDONLY)  = 3
fcntl(3, F_DUPFD, 10)                   = 11
close(3)                                = 0
fcntl(11, F_SETFD, FD_CLOEXEC)          = 0
read(11, "# default settings for icinga2's"..., 8192) = 92
read(11, "", 8192)                      = 0
close(11)                               = 0
open("/lib/init/vars.sh", O_RDONLY)     = 3
fcntl(3, F_DUPFD, 10)                   = 11
close(3)                                = 0
fcntl(11, F_SETFD, FD_CLOEXEC)          = 0
read(11, "#\n# Set rcS vars\n#\n\n# Because /e"..., 8192) = 1212
stat("/etc/default/rcS", {st_mode=S_IFREG|0644, st_size=821, ...}) = 0
open("/etc/default/rcS", O_RDONLY)      = 3
fcntl(3, F_DUPFD, 10)                   = 12
close(3)                                = 0
fcntl(12, F_SETFD, FD_CLOEXEC)          = 0
read(12, "################################"..., 8192) = 821
read(12, "", 8192)                      = 0
close(12)                               = 0
faccessat(AT_FDCWD, "/proc/cmdline", R_OK) = 0
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27307
close(4)                                = 0
read(3, "BOOT_IMAGE=/boot/vmlinuz-4.9.0-5"..., 128) = 95
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27307, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27307
read(11, "", 8192)                      = 0
close(11)                               = 0
open("/lib/lsb/init-functions", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10)                   = 11
close(3)                                = 0
fcntl(11, F_SETFD, FD_CLOEXEC)          = 0
read(11, "# /lib/lsb/init-functions for De"..., 8192) = 8192
read(11, "# On Debian, would output \"Start"..., 8192) = 3318
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27308
close(4)                                = 0
read(3, "/lib/lsb/init-functions.d/20-lef"..., 128) = 83
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27308, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 83
read(3, "", 128)                        = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27308
faccessat(AT_FDCWD, "/lib/lsb/init-functions.d/20-left-info-blocks", R_OK) = 0
open("/lib/lsb/init-functions.d/20-left-info-blocks", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10)                   = 12
close(3)                                = 0
fcntl(12, F_SETFD, FD_CLOEXEC)          = 0
read(12, "# Default info blocks put to the"..., 8192) = 1088
read(12, "", 8192)                      = 0
close(12)                               = 0
faccessat(AT_FDCWD, "/lib/lsb/init-functions.d/40-systemd", R_OK) = 0
open("/lib/lsb/init-functions.d/40-systemd", O_RDONLY) = 3
fcntl(3, F_DUPFD, 10)                   = 12
close(3)                                = 0
fcntl(12, F_SETFD, FD_CLOEXEC)          = 0
read(12, "# -*-Shell-script-*-\n# /lib/lsb/"..., 8192) = 2942
stat("/run/systemd/system", {st_mode=S_IFDIR|0755, st_size=40, ...}) = 0
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27309
close(4)                                = 0
read(3, "loaded\n", 128)                = 7
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27309, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27309
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27310
close(4)                                = 0
read(3, "/etc/init.d/icinga2\n", 128)   = 20
read(3, "", 128)                        = 0
close(3)                                = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27310, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27310
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27311
close(4)                                = 0
read(3, "yes\n", 128)                   = 4
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27311, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27311
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27312
close(4)                                = 0
read(3, "degraded\n", 128)              = 9
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27312, si_uid=0, si_status=1, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 9
read(3, "", 128)                        = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 1}], 0, NULL) = 27312
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid()                               = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid()                               = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27313
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27313
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27313, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27313
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27314
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27314
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27314, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27314
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
write(1, "[....] ", 7[....] )                  = 7
write(1, "Reloading icinga2 configuration "..., 64Reloading icinga2 configuration (via systemctl): icinga2.service) = 64
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27315
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27315
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27315, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27315
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid()                               = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid()                               = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27381
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27381
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27381, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27381
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27382
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27382
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27382, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27382
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27383
close(4)                                = 0
read(3, "\33[31m", 128)                 = 5
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27383, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27383
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27384
close(4)                                = 0
read(3, "\33[32m", 128)                 = 5
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27384, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27384
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27385
close(4)                                = 0
read(3, "\33[33m", 128)                 = 5
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27385, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27385
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27386
close(4)                                = 0
read(3, "\33[39;49m", 128)              = 8
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27386, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 94820858675199
read(3, "", 128)                        = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27386
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27387
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27387
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27387, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27387
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27388
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27388
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27388, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27388
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27389
[{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27389
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27389, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27389
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27390
wait4(-1, [ ok [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27390
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27390, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27390
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27391
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27391
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27391, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27391
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27392
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27392
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27392, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27392
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
geteuid()                               = 0
stat("/usr/bin/tput", {st_mode=S_IFREG|0755, st_size=18456, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/tput", X_OK) = 0
geteuid()                               = 0
stat("/usr/bin/expr", {st_mode=S_IFREG|0755, st_size=43848, ...}) = 0
faccessat(AT_FDCWD, "/usr/bin/expr", X_OK) = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27393
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27393
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27393, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27393
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
fcntl(1, F_DUPFD, 10)                   = 13
close(1)                                = 0
fcntl(13, F_SETFD, FD_CLOEXEC)          = 0
dup2(3, 1)                              = 1
close(3)                                = 0
fcntl(2, F_DUPFD, 10)                   = 14
close(2)                                = 0
fcntl(14, F_SETFD, FD_CLOEXEC)          = 0
dup2(1, 2)                              = 2
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27395
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27395
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27395, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 27395
dup2(13, 1)                             = 1
close(13)                               = 0
dup2(14, 2)                             = 2
close(14)                               = 0
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27396
close(4)                                = 0
read(3, "\33[31m", 128)                 = 5
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27396, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27396
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27397
close(4)                                = 0
read(3, "\33[33m", 128)                 = 5
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27397, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27397
pipe([3, 4])                            = 0
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fab5c6479d0) = 27398
close(4)                                = 0
read(3, "\33[39;49m", 128)              = 8
read(3, "", 128)                        = 0
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=27398, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
rt_sigreturn({mask=[]})                 = 0
close(3)                                = 0
wait4(-1, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27398
write(1, ".\n", 2.
)                      = 2
close(12)                               = 0
close(11)                               = 0
exit_group(0)                           = ?
+++ exited with 0 +++

@Crunsher
Contributor

We figured it out: You are using an old init script!
Why? Because we package our init scripts on Debian separately and we forgot to update them 💃

@Crunsher Crunsher added the area/setup Installation, systemd, sample files label Feb 15, 2018
@Crunsher Crunsher added this to the 2.8.2 milestone Feb 16, 2018
@lazyfrosch
Contributor

lazyfrosch commented Feb 19, 2018

This is not an old script; it's the init script for Debian-based systems. We need to figure out the issue and fix it in Icinga/deb-icinga2#2

Edit: Please discuss there

@Crunsher
Contributor

Sadly @lazyfrosch is right, this is not related to our changes. But since this is directly related to the debian init scripts, I'm closing this in favor of Icinga/deb-icinga2#4

@lazyfrosch lazyfrosch reopened this Feb 19, 2018
@lazyfrosch
Contributor

Let's come back to the original problem, since I can't reproduce it with Debian stretch in sysV init mode.

I've installed a fresh Debian stretch, rebooted it with sysV and configured Icinga 2 + IDO MySQL

root@debian:~# icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-430-gc7ae986d9)

Copyright (c) 2012-2018 Icinga Development Team (https://www.icinga.com/)
License GPLv2+: GNU GPL version 2 or later <http://gnu.org/licenses/gpl2.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Application information:
  Installation root: /usr
  Sysconf directory: /etc
  Run directory: /run
  Local state directory: /var
  Package data directory: /usr/share/icinga2
  State path: /var/lib/icinga2/icinga2.state
  Modified attributes path: /var/lib/icinga2/modified-attributes.conf
  Objects path: /var/cache/icinga2/icinga2.debug
  Vars path: /var/cache/icinga2/icinga2.vars
  PID path: /run/icinga2/icinga2.pid

System information:
  Platform: Debian GNU/Linux
  Platform version: 9 (stretch)
  Kernel: Linux
  Kernel version: 4.9.0-4-amd64
  Architecture: x86_64

Build information:
  Compiler: GNU 6.3.0
  Build host: 4451229ca030

root@debian:~# cat /etc/debian_version 
9.3

root@debian:~# ps -ef | head -n5
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 16:05 ?        00:00:00 init [2]
root         2     0  0 16:05 ?        00:00:00 [kthreadd]
root         3     2  0 16:05 ?        00:00:00 [ksoftirqd/0]
root         5     2  0 16:05 ?        00:00:00 [kworker/0:0H]

root@debian:~# dpkg -S /sbin/init
sysvinit-core: /sbin/init

Now let's see how starting and reloading behave:

root@debian:~# /etc/init.d/icinga2 start
[ ok ] checking Icinga2 configuration.
[....] Starting icinga2 monitoring daemon: icinga2[2018-02-19 16:52:33 +0100] information/cli: Icinga application loader (version: v2.8.1-430-gc7ae986d9)
[2018-02-19 16:52:33 +0100] information/cli: Loading configuration file(s).
[2018-02-19 16:52:33 +0100] information/ConfigItem: Committing config item(s).
[2018-02-19 16:52:33 +0100] information/ApiListener: My API identity: debian
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 12 Services.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 ServiceGroups.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ScheduledDowntime.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 2 HostGroups.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Downtime.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 2 NotificationCommands.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 13 Notifications.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ApiUser.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Host.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 Zones.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 Endpoint.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 UserGroup.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 212 CheckCommands.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 3 TimePeriods.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Instantiated 1 User.
[2018-02-19 16:52:33 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2018-02-19 16:52:33 +0100] information/ConfigObject: Restoring program state from file '/var/lib/icinga2/icinga2.state'
[2018-02-19 16:52:33 +0100] information/ConfigObject: Restored 264 objects. Loaded 0 new objects without state.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Triggering Start signal for config items
[2018-02-19 16:52:33 +0100] information/NotificationComponent: 'notification' started.
[2018-02-19 16:52:33 +0100] information/ApiListener: 'api' started.
[2018-02-19 16:52:33 +0100] information/ApiListener: Adding new listener on port '5665'
[2018-02-19 16:52:33 +0100] information/CheckerComponent: 'checker' started.
[2018-02-19 16:52:33 +0100] information/DbConnection: 'ido-mysql' started.
[2018-02-19 16:52:33 +0100] information/ConfigItem: Activated all objects.
[.ok 
root@debian:~# ps -ef | grep icinga2
nagios   14901     1  0 16:52 pts/0    00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err
nagios   14904     1  0 16:52 ?        00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err
root     14932  7167  0 16:52 pts/0    00:00:00 grep icinga2
root@debian:~# /etc/init.d/icinga2 reload
[ ok ] checking Icinga2 configuration.
[ ok ] icinga2 is running.
[ ok ] Reloading icinga2 monitoring daemon: icinga2.
root@debian:~# ps -ef | grep icinga2
nagios   15034     1  5 16:52 ?        00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err --reload-internal 14904
nagios   15050 15034  0 16:52 ?        00:00:00 /usr/lib/x86_64-linux-gnu/icinga2/sbin/icinga2 --no-stack-rlimit daemon -d -e /var/log/icinga2/icinga2.err --reload-internal 14904
root     15059  7167  0 16:52 pts/0    00:00:00 grep icinga2
root@debian:~# /etc/init.d/icinga2 status
[ ok ] icinga2 is running.

@Crunsher Were you able to reproduce the error?

@tmatthaeus what might be different in my setup compared to yours?

@Crunsher Crunsher removed this from the 2.8.2 milestone Feb 21, 2018
@lazyfrosch
Contributor

@tmatthaeus have you had this problem on any other Stretch/sysV system? I can't reproduce on a fresh install.

@lazyfrosch lazyfrosch assigned lazyfrosch and unassigned Crunsher Feb 25, 2018
@lazyfrosch
Contributor

Okay, so it doesn't look like OP is using sysV init; on closer inspection of his strace, it is systemd.

So the initscript doesn't matter at all...

To my surprise icinga2 is really dying during a reload with systemd:

root@debian:~# systemctl status icinga2.service 
● icinga2.service - Icinga host/service/network monitoring system
   Loaded: loaded (/lib/systemd/system/icinga2.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/icinga2.service.d
           └─limits.conf
   Active: inactive (dead) since Mon 2018-02-26 14:14:43 CET; 794ms ago
  Process: 1335 ExecReload=/usr/lib/icinga2/safe-reload /usr/lib/icinga2/icinga2 (code=exited, status=0/SUCCESS)
  Process: 1288 ExecStart=/usr/sbin/icinga2 daemon -e ${ICINGA2_ERROR_LOG} (code=exited, status=0/SUCCESS)
  Process: 1201 ExecStartPre=/usr/lib/icinga2/prepare-dirs /usr/lib/icinga2/icinga2 (code=exited, status=0/SUCCESS)
 Main PID: 1288 (code=exited, status=0/SUCCESS)

Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: 'ido-mysql' resumed.
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: MySQL IDO instance id: 1 (schema version: '1.14.3')
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:11 +0100] information/IdoMysqlConnection: Finished reconnecting to MySQL IDO database in 0.0316799 second(s).
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #5 (ApiListener, RelayQueue) items: 0, rate: 0.25/s (15/min 15/5min 15/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #6 (ApiListener, SyncQueue) items: 0, rate:  0/s (0/min 0/5min 0/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:21 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 0.35/s (21/min 21/5min 21/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:31 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 0.8/s (48/min 48/5min 48/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:41 +0100] information/WorkQueue: #7 (IdoMysqlConnection, ido-mysql) items: 6, rate: 1.25/s (75/min 75/5min 75/15min);
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:43 +0100] information/Application: Got reload command: Starting new instance.
Feb 26 14:14:43 debian icinga2[1288]: [2018-02-26 14:14:43 +0100] information/Application: Reload requested, letting new process take over.

This is currently only a problem with snapshots it seems.

Yes, the systemd file for Debian/Ubuntu differs, but I'm not sure why systemd is losing the daemon here...

@lazyfrosch
Contributor

I will switch the systemd script for Debian and Ubuntu to the one included with Icinga 2.

Meanwhile we should make sure that the daemon updates the PID file correctly:

root@debian:~# systemctl restart icinga2.service 
root@debian:~# ls -al /run/icinga2/icinga2.pid
-rw-rw---- 1 nagios nagios 5 Feb 26 14:54 /run/icinga2/icinga2.pid
root@debian:~# systemctl reload icinga2.service 
root@debian:~# ls -al /run/icinga2/icinga2.pid
ls: cannot access '/run/icinga2/icinga2.pid': No such file or directory

@Crunsher Could you have a look at this?
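
The manual `ls` check above can be scripted. What follows is only a minimal Python sketch, assuming a POSIX system; `read_live_pid` is a hypothetical helper and not part of Icinga 2. It reports whether the PID file exists, parses, and points at a process that is still running.

```python
import os
from pathlib import Path
from typing import Optional


def read_live_pid(pid_file: Path) -> Optional[int]:
    """Return the PID stored in pid_file if that process exists, else None."""
    try:
        pid = int(pid_file.read_text().strip())
    except (FileNotFoundError, ValueError):
        # File is missing (as after the failing reload) or does not hold a number.
        return None
    if pid <= 0:
        return None
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, it only checks that the PID exists
    except ProcessLookupError:
        return None
    except PermissionError:
        pass  # the process exists but belongs to another user
    return pid
```

Calling `read_live_pid(Path("/run/icinga2/icinga2.pid"))` before and after `systemctl reload icinga2.service` would return `None` in the failing case shown above, because the file is gone.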

@lazyfrosch lazyfrosch changed the title No icinga2 process after reload PID file removed after reload Feb 26, 2018
@lazyfrosch
Contributor

Note: This affects current master starting with c418a96, and therefore probably 2.8.2 as well.

@Crunsher
Contributor

It doesn't happen in my custom debug build, so it is likely a permission problem or something similar. I'll make sure this gets looked at before 2.8.2 is released.

jflach@jfws ~/git/icinga2/build$ sudo systemctl start icinga2
jflach@jfws ~/git/icinga2/build$ sudo systemctl reload icinga2
jflach@jfws ~/git/icinga2/build$ sudo systemctl status icinga2
● icinga2.service - Icinga host/service/network monitoring system
   Loaded: loaded (/usr/lib/systemd/system/icinga2.service; disabled; vendor preset: enabled)
   Active: active (running) since Tue 2018-02-27 10:38:09 CET; 12s ago
  Process: 21546 ExecReload=/home/jflach/i2/lib/icinga2/safe-reload /home/jflach/i2/etc/sysconfig/icinga2 (code=exited, status=0/SUCCESS)
  Process: 21396 ExecStartPre=/home/jflach/i2/lib/icinga2/prepare-dirs /home/jflach/i2/etc/sysconfig/icinga2 (code=exited, status=0/SUCCESS)
 Main PID: 21615 (icinga2)
    Tasks: 16 (limit: 4915)
   CGroup: /system.slice/icinga2.service
           ├─21615 /home/jflach/i2/lib/icinga2/sbin/icinga2 --no-stack-rlimit daemon -e $ICINGA2_LOG_DIR/error.log --reload-internal 21400
           └─21644 /home/jflach/i2/lib/icinga2/sbin/icinga2 --no-stack-rlimit daemon -e $ICINGA2_LOG_DIR/error.log --reload-internal 21400

Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:09 +0100] information/ConfigItem: Activated all objects.
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] critical/TcpSocket: Invalid socket: No route to host
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] critical/ApiListener: Cannot connect to host '192.168.225.200' on port '5665'
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:12 +0100] information/ApiListener: Finished reconnecting to endpoint 'WingdingsII' via host '192.168.225.200' and port '5665'
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:16 +0100] information/Application: Got reload command: Starting new instance.
Feb 27 10:38:16 jfws icinga2[21400]: [2018-02-27 10:38:16 +0100] information/Application: Reload requested, letting new process take over.
Feb 27 10:38:16 jfws systemd[1]: icinga2.service: Supervising process 21615 which is not our child. We'll most likely not notice when it exits.
Feb 27 10:38:17 jfws systemd[1]: Reloaded Icinga host/service/network monitoring system.
Feb 27 10:38:20 jfws icinga2[21615]: Invalid socket: No route to host
Feb 27 10:38:20 jfws icinga2[21615]: Cannot connect to host '192.168.225.200' on port '5665'
jflach@jfws ~/git/icinga2/build$ stat ~/i2/var/run/icinga2/icinga2.pid
  File: /home/jflach/i2/var/run/icinga2/icinga2.pid
  Size: 6         	Blocks: 8          IO Block: 4096   regular file
Device: fe01h/65025d	Inode: 6823849     Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  jflach)   Gid: ( 1000/  jflach)
Access: 2018-02-27 10:38:16.756144593 +0100
Modify: 2018-02-27 10:38:17.008146410 +0100
Change: 2018-02-27 10:38:17.008146410 +0100
 Birth: -
jflach@jfws ~/git/icinga2/build$ icinga2 --version
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-488-g98bcca5e1; debug)

@Crunsher Crunsher modified the milestones: 2.8.2, 2.9.0 Feb 27, 2018
@Crunsher
Contributor

Update: I thought this was a 2.8.2 issue but it's not 🤷‍♀️

@lazyfrosch
Contributor

I think the core problem with this issue is that we no longer update the PID file on SIGUSR2 takeover.

Can we discuss this tomorrow in person? I'd call Stop() inside SigUsr2 so the PID file is updated before exiting the prior daemon. (Current takeover for systemd and sysV)
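
To illustrate what "updated before exiting" could look like, here is a rough Python sketch of an atomic PID file update. This is not how Icinga 2 (a C++ daemon) implements it; `write_pid_file` is a hypothetical helper, and write-to-temp-then-rename is a generic technique swapped in for illustration. The point is that the file is replaced in one step, so a supervisor reading it never sees it missing or half-written.

```python
import os
import tempfile
from pathlib import Path


def write_pid_file(pid_file: Path, pid: int) -> None:
    """Atomically replace pid_file with a file containing pid."""
    # Create the temp file in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(pid_file.parent), prefix=pid_file.name + ".")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(f"{pid}\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the content is on disk before the rename
        os.replace(tmp_name, pid_file)  # atomic on POSIX: old or new content, never neither
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

In a takeover, the process that is about to become the main daemon would call something like `write_pid_file(path, os.getpid())`, and only afterwards would the old daemon exit.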

@lazyfrosch lazyfrosch reopened this Mar 11, 2018
@Crunsher
Contributor

Crunsher commented Mar 13, 2018

I am not able to reproduce this:

jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 start                                                                                                                     
Checking configuration: Done
Starting Icinga 2: Done
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 status                                                                                                                          
Icinga 2 status: Running
jflach@jfws ~/git/icinga2/build$ cat ~/i2/var/run/icinga2/icinga2.pid                                                                                                                    
25409
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 reload                                                                                                                           
Validating config files: Done
Reloading Icinga 2: Done
jflach@jfws ~/git/icinga2/build$ cat ~/i2/var/run/icinga2/icinga2.pid                                                                                                                    
25771
jflach@jfws ~/git/icinga2/build$ ~/i2/etc/init.d/icinga2 status                                                                                           
Icinga 2 status: Running
jflach@jfws ~/git/icinga2/build$ icinga2 version                                                                                                                                         
icinga2 - The Icinga 2 network monitoring daemon (version: v2.8.1-526-g9b0fccfd8; debug)

We can easily replace our call to Exit with a call to Stop, though I'd like a way to test this first.

@lazyfrosch
Contributor

The user is using systemd! systemd expects the PID file to be updated before the old daemon exits.

We seem to no longer do that. So the "old" systemd unit fails.

But we should make sure this still works, despite updating the unit file...

@Crunsher
Contributor

As I mentioned earlier, I'm unable to reproduce this with systemd either.
After talking with @gunnarbeutner this seems to be timing related. I'll investigate further and see if Stop() is enough or if we need to take additional care to make sure the new PID file is written before we kill the old process.
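
One way to picture the "additional care" is for the old process to wait until the PID file names the new process before it exits. The following is only a Python sketch under that assumption, not the Icinga 2 code; `wait_for_pid_file`, its timeout, and its polling interval are hypothetical.

```python
import time
from pathlib import Path


def wait_for_pid_file(pid_file: Path, expected_pid: int,
                      timeout: float = 10.0, interval: float = 0.05) -> bool:
    """Poll pid_file until it contains expected_pid; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            if int(pid_file.read_text().strip()) == expected_pid:
                return True
        except (FileNotFoundError, ValueError):
            pass  # file not written yet, or caught mid-write; try again
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

The old daemon would call this with the PID of the new instance and exit only when it returns `True`; on `False` it could keep running and log the failed takeover rather than leave the service without a tracked main process.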
