[2.10.2] Director deploy crashes the Icinga service [FreeBSD] #6807

Closed
Stefar77 opened this issue Nov 29, 2018 · 10 comments
Labels
area/api (REST API), bug (Something isn't working)

@Stefar77
Contributor

Stefar77 commented Nov 29, 2018

With Icinga 2.10.2-1, when somebody presses Deploy in Icinga Director, the Icinga 2 service fails to start the new process.

Failure: Address already in use on the API socket.

Current Behavior

Use Deploy in Director; after a couple of seconds the newly created Icinga process crashes with an "Address already in use" error.

Possible Solution

Wait a bit longer before starting the new listener, or retry when getting the "Address already in use" error on reload.
I didn't take the time to see how or when the old process closes the socket, so I just added a sleep(1); before creating the primary JSON-RPC listener (remote/apilistener.cpp:253), and that has fixed the problem for now.
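
The retry idea above can be illustrated with a minimal Python sketch. This is not Icinga's code and not the fix that was eventually merged; the function name bind_with_retry and its attempts and delay parameters are hypothetical, chosen only to show the pattern of retrying a bind while the old process still holds the port.

```python
import errno
import socket
import time


def bind_with_retry(host, port, attempts=5, delay=0.2):
    """Bind a listening TCP socket, retrying while the address is in use.

    Hypothetical helper for illustration; attempts and delay are assumptions.
    """
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
            sock.listen(socket.SOMAXCONN)
            return sock
        except OSError as exc:
            sock.close()
            # Only EADDRINUSE is worth retrying; any other error, or the
            # last failed attempt, is re-raised to the caller.
            if exc.errno != errno.EADDRINUSE or attempt == attempts:
                raise
            time.sleep(delay)
```

Compared with an unconditional sleep(1), a bounded retry only waits when the bind actually fails and still reports the error if the port never becomes free.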

Steps to Reproduce (for bugs)

  1. Install Icingaweb2 and Director (on FreeBSD 11.2 in a jail)
  2. Press deploy

Context

We upgraded from 2.8 to 2.10.2-1 and noticed the bug the next morning. I tried setting bind_host and bind_port, but that didn't seem to help, so I ended up patching the source code a bit.

Your Environment

  • Version used (icinga2 --version): 2.10.2
  • Operating System and version: FreeBSD 11.2
  • Enabled features (icinga2 feature list): api checker command graphite ido-mysql livestatus mainlog notification syslog
  • Icinga Web 2 version and modules (System - About): 2.6.2
  • Config validation (icinga2 daemon -C): no problems
@Crunsher
Contributor

Crunsher commented Dec 3, 2018

Can you also reproduce this without Director, by sending SIGHUP to the Icinga process?
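
For anyone who wants to script that check, here is a minimal Python sketch of sending SIGHUP to a process on a POSIX system. The helper name send_reload_signal is hypothetical, and the PID has to come from wherever your installation records it; no particular PID file path is assumed here.

```python
import os
import signal


def send_reload_signal(pid):
    """Send SIGHUP to the process with the given PID (POSIX only).

    Hypothetical helper for illustration; the caller supplies the PID.
    """
    os.kill(pid, signal.SIGHUP)
```

Calling send_reload_signal with the PID of the running Icinga 2 main process would show whether the crash also occurs on a plain reload, without Director involved.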

@dnsmichi
Contributor

dnsmichi commented Dec 5, 2018

Sounds a bit like #6815, can you test that patch please? :)

@dnsmichi dnsmichi added the area/api REST API label Dec 5, 2018
@Stefar77
Contributor Author

Stefar77 commented Dec 5, 2018

@dnsmichi That didn't help; it still quits when I deploy.
@Crunsher Yes, it just makes an Icinga process go to 2000% CPU for a bit and then all Icinga processes quit.

I'll add the sleep(1); again for now.

@Stefar77
Contributor Author

Stefar77 commented Dec 5, 2018

I also noticed with 2.10.2 on Windows (agents) that when you start the daemon from the console and press CTRL-C, it hangs and spams socket errors until you close the command prompt.

image

@dnsmichi
Contributor

dnsmichi commented Dec 5, 2018

That's a different problem on Windows: you cannot run it in the foreground in cmd. That's tracked in #3029.

@bmccorkle

I'm experiencing this same issue since updating Icinga2 recently.

FreeBSD 11.2
Icinga2 2.10.1

Disabled features: compatlog elasticsearch gelf graphite livestatus opentsdb perfdata statusdata syslog
Enabled features: api checker command debuglog ido-mysql influxdb mainlog notification

@mat813
Contributor

mat813 commented Dec 26, 2018

I do not use the director, and I have been experiencing the same issue ever since 2.10.0.

In the logs, I get:

[2018-12-26 16:46:01 +0100] critical/TcpSocket: Invalid socket: Address already in use
Context:
        (0) Activating object 'api' of type 'ApiListener'

[2018-12-26 16:46:01 +0100] critical/ApiListener: Cannot bind TCP socket for host '' on port '5665'.
Context:
        (0) Activating object 'api' of type 'ApiListener'

[2018-12-26 16:46:01 +0100] critical/ApiListener: Cannot add listener on host '' for port '5665'.
Context:
        (0) Activating object 'api' of type 'ApiListener'

To mitigate the issue, I changed our internal documentation to say to run service icinga2 restart instead of reload on the master, and then to go to the salt master and run salt '*' service.start icinga2. It feels like it tries to restart the API before the previous instance has stopped, or something.

@dnsmichi
Contributor

dnsmichi commented Jan 7, 2019

It is a problem with re-using the sockets and allowing incoming API requests which block on shutdown. Their handling was changed in 2.10. It can only be seen in environments with many parallel requests and reloads.

@dnsmichi dnsmichi added the bug Something isn't working label Jan 7, 2019
@dnsmichi dnsmichi added this to the 2.11.0 milestone Jan 7, 2019
@mat813
Contributor

mat813 commented Jan 7, 2019

Mmmm, I do have ~50 satellites, but I do not reload often. It happens on every reload, though.

@dnsmichi
Contributor

#6898 and #6901 fix this; it seems to be a problem specifically with *BSD.

@dnsmichi dnsmichi modified the milestones: 2.11.0, 2.10.3 Feb 11, 2019
5 participants