Hosts disappears in Web after deployment #7024

Closed
BastiBr opened this issue Mar 18, 2019 · 7 comments
Labels
area/db-ido Database output bug Something isn't working

BastiBr commented Mar 18, 2019

Expected Behavior

After a deployment in the Director, Icinga Web 2 should display all hosts as normal.

Current Behavior

After a deployment in the Director, the hosts disappear from the web interface. It takes a few seconds before they are no longer visible.

After 1-2 minutes, the hosts are visible again.

Possible Solution

After some discussions at Icinga Camp Berlin, it may be that the Icinga core sets the hosts to inactive in the DB after a deployment. (For this reason I filed the bug against the core, not the web. If it is misrouted, please let me know.)
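
One way to check this hypothesis would be to count active and inactive host objects in the IDO database while a deployment is running. The sketch below is only an illustration under stated assumptions: it assumes the IDO schema keeps objects in an `icinga_objects` table, marks hosts with `objecttype_id = 1`, and flags live objects with `is_active = 1`. It runs against an in-memory SQLite stand-in, not the real IDO MySQL database; the helper names (`count_hosts_by_state`, `make_demo_db`) and the demo rows are hypothetical.

```python
import sqlite3

# Assumption: hosts are stored in icinga_objects with objecttype_id = 1,
# and objects that are currently part of the config have is_active = 1.
HOST_OBJECTTYPE_ID = 1

COUNT_HOSTS_SQL = """
    SELECT is_active, COUNT(*)
    FROM icinga_objects
    WHERE objecttype_id = ?
    GROUP BY is_active
"""


def count_hosts_by_state(conn):
    """Return (active, inactive) host counts from an IDO-like database."""
    counts = {0: 0, 1: 0}
    for is_active, total in conn.execute(COUNT_HOSTS_SQL, (HOST_OBJECTTYPE_ID,)):
        counts[int(is_active)] = total
    return counts[1], counts[0]


def make_demo_db():
    """Build an in-memory stand-in: two active hosts, one inactive host, one service."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE icinga_objects ("
        "object_id INTEGER PRIMARY KEY, objecttype_id INTEGER, "
        "name1 TEXT, is_active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO icinga_objects (objecttype_id, name1, is_active) VALUES (?, ?, ?)",
        [(1, "host-a", 1), (1, "host-b", 1), (1, "host-c", 0), (2, "host-a", 1)],
    )
    return conn


if __name__ == "__main__":
    active, inactive = count_hosts_by_state(make_demo_db())
    print(f"active hosts: {active}, inactive hosts: {inactive}")
```

If the active count drops towards zero shortly after a deployment and recovers a minute or two later, that would match the behavior described above. Against a real MySQL IDO database the same query should apply, but the placeholder style (`?` here) depends on the driver in use.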

Steps to Reproduce (for bugs)

  1. Deploy a config via the Director module
  2. Go to Overview - Hosts
  3. Wait a few seconds
  4. No hosts are visible in the list
  5. If you are on a specific host view, the following error appears:
#0 /usr/share/php/Icinga/Exception/IcingaException.php(41): ReflectionClass->newInstanceArgs(Array)
#1 /usr/share/php/Icinga/Web/Controller.php(87): Icinga\Exception\IcingaException::create(Array)
#2 /usr/share/icingaweb2/modules/monitoring/application/controllers/HostController.php(33): Icinga\Web\Controller->httpNotFound(String)
#3 /usr/share/php/Icinga/Web/Controller/ActionController.php(152): Icinga\Module\Monitoring\Controllers\HostController->init()
#4 /usr/share/php/Icinga/Web/Controller/Dispatcher.php(59): Icinga\Web\Controller\ActionController->__construct(Object(Icinga\Web\Request), Object(Icinga\Web\Response), Array)
#5 /usr/share/icingaweb2/library/vendor/Zend/Controller/Front.php(937): Icinga\Web\Controller\Dispatcher->dispatch(Object(Icinga\Web\Request), Object(Icinga\Web\Response))
#6 /usr/share/php/Icinga/Application/Web.php(300): Zend_Controller_Front->dispatch(Object(Icinga\Web\Request), Object(Icinga\Web\Response))
#7 /usr/share/php/Icinga/Application/webrouter.php(104): Icinga\Application\Web->dispatch()
#8 /usr/share/icingaweb2/public/index.php(4): require_once(String)
#9 {main}

Context

Frequent deployments disrupt work in the web interface for many teams, because the hosts are not visible for minutes at a time.

Your Environment

  • The Web + Director is running on one master node.
  • The DB is running on a separate cluster
  • Version used (icinga2 --version): r2.10.3-1
  • Operating System and version: Ubuntu 18.04.2 LTS (Bionic Beaver)
  • Enabled features (icinga2 feature list): api checker command ido-mysql mainlog notification
  • Icinga Web 2 version and modules (System - About):
Icingaweb2 | 2.6.2
businessprocess | 2.1.0
director | master
doc | 2.6.2
elasticsearch | 1.0.0
grafana | 1.3.1
ipl | v0.1.1
monitoring | 2.6.2
reactbundle | v0.4.1
x509 | 1.0.0
  • Config validation (icinga2 daemon -C):
[2019-03-18 09:49:02 +0100] information/cli: Icinga application loader (version: r2.10.3-1)
[2019-03-18 09:49:02 +0100] information/cli: Loading configuration file(s).
[2019-03-18 09:49:02 +0100] information/ConfigItem: Committing config item(s).
[2019-03-18 09:49:02 +0100] information/ApiListener: My API identity: op-icn2mas-p101.domain.int
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 20380 Services.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 IcingaApplication.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1925 Hosts.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 EventCommand.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 FileLogger.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 18478 Dependencies.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 2 NotificationCommands.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 20886 Notifications.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 NotificationComponent.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 23 HostGroups.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 ApiListener.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 20 Downtimes.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 12 Comments.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 CheckerComponent.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1373 Zones.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 ExternalCommandListener.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1373 Endpoints.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 6 ApiUsers.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 499 Users.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 1 IdoMysqlConnection.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 257 CheckCommands.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 7 UserGroups.
[2019-03-18 09:49:08 +0100] information/ConfigItem: Instantiated 4 TimePeriods.
[2019-03-18 09:49:08 +0100] information/ScriptGlobal: Dumping variables to file '/var/cache/icinga2/icinga2.vars'
[2019-03-18 09:49:08 +0100] information/cli: Finished validating the configuration file(s).
  • If you run multiple Icinga 2 instances, the zones.conf file (or icinga2 object list --type Endpoint and icinga2 object list --type Zone) from all affected nodes.
object Endpoint "op-icn2mas-p101.domain.int" {
}
object Endpoint "op-icn2mas-p102.domain.int" {
        host = "IP"
}
object Zone "master-zone" {
        endpoints = [ "op-icn2mas-p101.domain.int", "op-icn2mas-p102.domain.int" ]
}
object Zone "global-templates" {
        global = true
}
object Zone "director-global" {
        global = true
}
object Endpoint "op-icn2pci-p101.domain.int" {
        host = "IP"
}
object Endpoint "op-icn2pci-p102.domain.int" {
        host = "IP"
}
object Zone "pci-zone" {
        parent = "master-zone"
        endpoints = [ "op-icn2pci-p101.domain.int", "op-icn2pci-p102.domain.int" ]
}
@dnsmichi
Contributor

I've fixed something for 2.11 which might solve this, specifically #6970.

@dnsmichi dnsmichi added the area/db-ido Database output label Mar 18, 2019
@BastiBr
Author

BastiBr commented Mar 18, 2019

@dnsmichi Thank you, that could be the fix. I will test it when 2.11 is released.

@dnsmichi
Contributor

You can test the snapshot packages, e.g. within the Vagrant boxes.

@dnsmichi dnsmichi added the needs feedback We'll only proceed once we hear from you again label Mar 18, 2019
@BastiBr
Author

BastiBr commented Mar 19, 2019

I tested it with version 2.10.2+257.g856d3a1b4.2019.02.23+1.bionic-0, and at the moment I cannot reproduce the error.

So it seems that your fix worked. Thank you @dnsmichi

@dnsmichi dnsmichi added this to the 2.11.0 milestone Mar 19, 2019
@dnsmichi dnsmichi added the bug Something isn't working label Mar 19, 2019
@dnsmichi dnsmichi removed the needs feedback We'll only proceed once we hear from you again label Mar 19, 2019
@dnsmichi
Contributor

Ok, thanks for testing. I'll mark this as resolved for 2.11 then. The patch is a tad larger, so I cannot backport it to 2.10 for now.

@lippserd
Member

@BastiBr I just wanted to double check whether the error is gone. If not, please check #7125. You could help with debug logs from a restart where your host objects get disabled.

@BastiBr
Author

BastiBr commented Apr 24, 2019

@lippserd We tested it with the snapshot build mentioned above. Because the behavior only appears in our large production environment, we tested it in production. With this build we could not reproduce the error.
(We tested it for two days.)

After testing, we went back to a stable release because we were facing some stability issues on the masters with the snapshot build.

#7125: maybe also test with the 2.11 snapshot build?

Cheers,
Basti
