BUG Report: Downtime and Comments not able to parse livestatus output #643

Open
tjyang opened this issue Mar 25, 2018 · 21 comments

@tjyang

tjyang commented Mar 25, 2018

  • WHAT :
InvalidResponseFromLivestatus: Could not parse response from livestatus. \
Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response: 
  • HOW: query downtimes in Adagios.
  • Environment info
Latest CentOS 7.7
adagios 1.6.3.
Nagios 4.3.4
check-mk-livestatus-1.2.8 and above
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on


Response: [[u"author",u"comment",u"duration",u"end_time",u"entry_time",u"fixed",u"host_accept_passive_checks",u"host_acknowledged",u"host_acknowledgement_type",u"host_action_url",u"host_action_url_expanded",u"host_active_checks_enabled",u"host_address",u"host_alias",u"host_check_command",u"host_check_command_expanded",u"host_check_flapping_recovery_notification",u"host_check_freshness",u"host_check_interval",u"host_check_options",u"host_check_period",u"host_check_type",u"host_checks_enabled",u"host_childs",u"host_comments",u"host_comments_with_extra_info",u"host_comments_with_info",u"host_contact_groups",u"host_contacts",u"host_current_attempt",u"host_current_notification_number",u"host_custom_variable_names",u"host_custom_variable_values",u"host_custom_variables",u"host_display_name",u"host_downtimes",u"host_downtimes_with_info",u"host_event_handler",u"host_event_handler_enabled",u"host_execution_time",u"host_filename",u"host_first_notification_delay",u"host_flap_detection_enabled",u"host_groups",u"host_hard_state",u"host_has_been_checked",u"host_high_flap_threshold",u"host_icon_image",u"host_icon_image_alt",u"host_icon_image_expanded",u"host_in_check_period",u"host_in_notification_period",u"host_in_service_period",u"host_initial_state",u"host_is_executing",u"host_is_flapping",u"host_last_check",u"host_last_hard_state",u"host_last_hard_state_change",u"host_last_notification",u"host_last_state",u"host_last_state_change",u"host_last_time_down",u"host_last_time_unreachable",u"host_last_time_up",u"host_latency",u"host_long_plugin_output",u"host_low_flap_threshold",u"host_max_check_attempts",u"host_metrics",u"host_mk_inventory",u"host_mk_inventory_gz",u"host_mk_inventory_last",u"host_modified_attributes",u"host_modified_attributes_list",u"host_name",u"host_next_check",u"host_next_notification",u"host_no_more_notifications",u"host_notes",u"host_notes_expanded",u"host_notes_url",u"host_notes_url_expanded",u"host_notification_interval",u"host_notification_period",u"host_no
tifications_enabled",u"host_num_services",u"host_num_services_crit",u"host_num_services_hard_crit",u"host_num_services_hard_ok",u"host_num_services_hard_unknown",u"host_num_services_hard_warn",u"host_num_services_ok",u"host_num_services_pending",u"host_num_services_unknown",u"host_num_services_warn",u"host_obsess_over_host",u"host_parents",u"host_pending_flex_downtime",u"host_percent_state_change",u"host_perf_data",u"host_plugin_output",u"host_pnpgraph_present",u"host_process_performance_data",u"host_retry_interval",u"host_scheduled_downtime_depth",u"host_service_period"
@tjyang tjyang changed the title from "BUG Report: InvalidResponseFromLivestatus: Could not parse response from livestatus. Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response:" to "BUG Report: Downtime and Comments not able to parse livestatus output" Mar 25, 2018
@Mjolinir

Mjolinir commented May 10, 2018

I can confirm this issue still exists and seems to be on the Adagios side:

CentOS 7.4 (and CentOS 7.5)
adagios-1.6.3-1
nagios-4.3.4-5
check-mk-livestatus-1.2.8p26-1

Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

Livestatus is indeed loaded and working; I can verify it with the following:

echo 'GET hosts' | unixcat /var/spool/nagios/cmd/livestatus

and also via the following in the logs:

livestatus: Livestatus 1.2.8p26 by Mathias Kettner. Socket: '/var/spool/nagios/cmd/livestatus'
livestatus: Please visit us at http://mathias-kettner.de/
livestatus: Hint: please try out OMD - the Open Monitoring Distribution
livestatus: Please visit OMD at http://omdistro.org
livestatus: Finished initialization. Further log messages go to /var/log/nagios/livestatus.log
Event broker module '/usr/lib64/check_mk/livestatus.o' initialized successfully.
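
The same check can be made against the exact query that fails, without going through Adagios. The following is a minimal sketch, assuming the socket path from the log lines above and the request headers shown in the traceback; the helper names (`build_query`, `parse_fixed16`, `read_exact`, `query_livestatus`) are made up for illustration and are not part of pynag or livestatus.

```python
import socket

# Socket path taken from the log lines above; adjust for your installation.
SOCKET_PATH = "/var/spool/nagios/cmd/livestatus"


def build_query(table, output_format="python", column_headers=True):
    """Build an LQL request with the same headers as in the traceback."""
    lines = [
        "GET %s" % table,
        "ResponseHeader: fixed16",
        "OutputFormat: %s" % output_format,
        "ColumnHeaders: %s" % ("on" if column_headers else "off"),
    ]
    # A blank line terminates the request.
    return "\n".join(lines) + "\n\n"


def parse_fixed16(header):
    """Split the 16-byte fixed16 header into (status_code, body_length)."""
    if len(header) != 16:
        raise ValueError("expected 16 header bytes, got %d" % len(header))
    text = header.decode("ascii")
    # Bytes 0-2: status code, byte 3: space, bytes 4-14: padded length.
    return int(text[0:3]), int(text[4:15])


def read_exact(sock, count):
    """Read exactly count bytes, or raise if the peer closes early."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise EOFError("socket closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def query_livestatus(table, socket_path=SOCKET_PATH):
    """Send one query and return (status_code, raw_body_bytes)."""
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(socket_path)
        sock.sendall(build_query(table).encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)
        status, length = parse_fixed16(read_exact(sock, 16))
        return status, read_exact(sock, length)
    finally:
        sock.close()


if __name__ == "__main__":
    status, body = query_livestatus("downtimes")
    print(status, len(body))
```

Running it as a user with access to the socket prints the status code and the body size for `GET downtimes`, which separates "livestatus answered" from "the client could not parse the answer".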

@gardart
Contributor

gardart commented May 11, 2018

Thank you @tjyang and @Mjolinir.
What version of pynag are you using?

@Mjolinir

Mjolinir commented May 16, 2018

Hello gardart! I hope this is something that can be fixed relatively easily. It has been broken for some time.

For me it looks to be:
pynag-0.9.1-1

Please let me know if there is anything else I can do to help.

@gardart
Contributor

gardart commented May 23, 2018

@Mjolinir and @tjyang, could you try updating to the latest pynag and adagios (released last week) using
yum --enablerepo=ok-testing update pynag adagios
Let me know if this solves the issue.

@tjyang
Author

tjyang commented May 23, 2018

@Mjolinir, I installed the new RPMs on my test Nagios instance, but it didn't help. Can you confirm?

After "yum --enablerepo=ok-testing update pynag adagios"
[me@nagios03 ~]$ rpm -qa |egrep 'adagio|pynag'
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
[me@ilclnagios03 ~]$

  • Error message in the web browser after clicking on Downtime.

Oh no, something went wrong ☹
InvalidResponseFromLivestatus: Could not parse response from livestatus. Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response: 
  • Show debug
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 971, in downtime_list
    c['downtimes'] = l.query('GET downtimes', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET downtimes
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

@gardart
Contributor

gardart commented May 23, 2018

@tjyang
Could you try adding these options to your livestatus broker_module line in /etc/nagios/nagios.cfg?
debug=1 query_timeout=0

@tjyang
Author

tjyang commented May 23, 2018

  • Comments and Downtime under the REPORTS section both have the same issue.
    image

  • config changed.

[root@nagios03 nagios]# egrep ^broker_module=/usr/lib64/check_mk/livestatus.o  /etc/nagios/nagios.cfg
broker_module=/usr/lib64/check_mk/livestatus.o /var/spool/nagios/cmd/livestatus idle_timeout=12000 num_client_threads=20 debug=1 query_timeout=0
[root@nagios03 nagios]#
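
To double-check that the options actually ended up on the active broker_module line, the line can be split into its parts. This is only a sketch: `parse_broker_module` is a hypothetical helper, and it assumes the layout seen above (module path, then socket path, then `key=value` options separated by whitespace).

```python
def parse_broker_module(line):
    """Split a broker_module line into (module_path, socket_path, options)."""
    key, _, value = line.strip().partition("=")
    if key != "broker_module":
        raise ValueError("not a broker_module line: %r" % line)
    parts = value.split()
    if not parts:
        raise ValueError("broker_module line has no module path")
    module_path = parts[0]
    socket_path = None
    options = {}
    for token in parts[1:]:
        if "=" in token:
            # key=value arguments passed to the livestatus module
            name, _, setting = token.partition("=")
            options[name] = setting
        elif socket_path is None:
            # first bare argument is treated as the socket path
            socket_path = token
    return module_path, socket_path, options


if __name__ == "__main__":
    line = ("broker_module=/usr/lib64/check_mk/livestatus.o "
            "/var/spool/nagios/cmd/livestatus idle_timeout=12000 "
            "num_client_threads=20 debug=1 query_timeout=0")
    module_path, socket_path, options = parse_broker_module(line)
    print(module_path)
    print(socket_path)
    print(options)
```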

  • /var/log/nagios/livestatus.log
[root@inagios03 nagios]# tail -40 /var/log/nagios/livestatus.log
2018-05-23 10:20:55 Query: ResponseHeader: fixed16
2018-05-23 10:20:55 Time to process request: 12 us. Size of answer: 36 bytes
2018-05-23 10:20:56 Query: GET hosts
2018-05-23 10:20:56 Query: Stats: state >= 0
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 3
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: acknowledged = 0
2018-05-23 10:20:56 Query: Stats: hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 4
2018-05-23 10:20:56 Query: Filter: custom_variable_names < _REALNAME
2018-05-23 10:20:56 Query: Localtime: 1527085256
2018-05-23 10:20:56 Query: OutputFormat: python
2018-05-23 10:20:56 Query: KeepAlive: on
2018-05-23 10:20:56 Query: ResponseHeader: fixed16
2018-05-23 10:20:56 Time to process request: 856 us. Size of answer: 13 bytes
2018-05-23 10:20:56 Query: GET services
2018-05-23 10:20:56 Query: Stats: state >= 0
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_state = 0
2018-05-23 10:20:56 Query: Stats: last_hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 5
2018-05-23 10:20:56 Query: Stats: state > 0
2018-05-23 10:20:56 Query: Stats: scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: host_scheduled_downtime_depth = 0
2018-05-23 10:20:56 Query: Stats: acknowledged = 0
2018-05-23 10:20:56 Query: Stats: host_state = 0
2018-05-23 10:20:56 Query: Stats: last_hard_state >= 1
2018-05-23 10:20:56 Query: StatsAnd: 6
2018-05-23 10:20:56 Query: Filter: host_custom_variable_names < _REALNAME
2018-05-23 10:20:56 Query: Localtime: 1527085256
2018-05-23 10:20:56 Query: OutputFormat: python
2018-05-23 10:20:56 Query: KeepAlive: on
2018-05-23 10:20:56 Query: ResponseHeader: fixed16
2018-05-23 10:20:56 Time to process request: 7114 us. Size of answer: 18 bytes
[root@nagios03 nagios]#

@Mjolinir

Mjolinir commented May 24, 2018

Looks very similar for me:

Updated to the new packages from ok-testing. Problem still exists, unfortunately.

Debug:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/adagios/views.py", line 43, in wrapper
    result = view_func(request, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/adagios/status/views.py", line 959, in comment_list
    c['comments'] = l.query('GET comments', *args)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/multisite.py", line 80, in query
    query_result = backend_instance.query(query, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/pynag/Parsers/livestatus.py", line 996, in query
    raise InvalidResponseFromLivestatus(query=livestatus_query, response=response_data)
InvalidResponseFromLivestatus: Could not parse response from livestatus.
Query:GET comments
ResponseHeader: fixed16
OutputFormat: python
ColumnHeaders: on

Error msg:

`InvalidResponseFromLivestatus: Could not parse response from livestatus. Query:GET downtimes ResponseHeader: fixed16 OutputFormat: python ColumnHeaders: on Response: [[u"author",u"comment",u"duration",u"end_time",u"entry_time",u"fixed",u"host_accept_passive_checks",u"host_acknowledged",u"host_acknowledgement_type",u"host_action_url",u"host_action_url_expanded",u"host_active_checks_enabled",u"host_address",u"host_alias",u"host_check_command",u"host_check_command_expanded",u"host_check_flapping_recovery_notification",u"host_check_freshness",u"host_check_interval",u"host_check_options",u"host_check_period",u"host_check_type",u"host_checks_enabled",u"host_childs",u"host_comments",u"host_comments_with_extra_info",u"host_comments_with_info",u"host_contact_groups",u"host_contacts",u"host_current_attempt",u"host_current_notification_number",u"host_custom_variable_names",u"host_custom_variable_values",u"host_custom_variables",u"host_display_name",u"host_downtimes",u"host_downtimes_with_info",u"host_event_handler",u"host_event_handler_enabled",u"host_execution_time",u"host_filename",u"host_first_notification_delay",u"host_flap_detection_enabled",u"host_groups",u"host_hard_state",u"host_has_been_checked",u"host_high_flap_threshold",u"host_icon_image",u"host_icon_image_alt",u"host_icon_image_expanded",u"host_in_check_period",u"host_in_notification_period",u"host_in_service_period",u"host_initial_state",u"host_is_executing",u"host_is_flapping",u"host_last_check",u"host_last_hard_state",u"host_last_hard_state_change",u"host_last_notification",u"host_last_state",u"host_last_state_change",u"host_last_time_down",u"host_last_time_unreachable",u"host_last_time_up",u"host_latency",u"host_long_plugin_output",u"host_low_flap_threshold",u"host_max_check_attempts",u"host_metrics",u"host_mk_inventory",u"host_mk_inventory_gz",u"host_mk_inventory_last",u"host_modified_attributes",u"host_modified_attributes_list",u"host_name",u"host_next_check",u"host_next_notification",u"host_no_more_notific
ations",u"host_notes",u"host_notes_expanded",u"host_notes_url",u"host_notes_url_expanded",u"host_notification_interval",u"host_notification_period",u"host_notifications_enabled",u"host_num_services",u"host_num_services_crit",u"host_num_services_hard_crit",u"host_num_services_hard_ok",u"host_num_services_hard_unknown",u"host_num_services_hard_warn",u"host_num_services_ok",u"host_num_services_pending",u"host_num_services_unknown",u"host_num_services_warn",u"host_obsess_over_host",u"host_parents",u"host_pending_flex_downtime",u"host_percent_state_change",u"host_perf_data",u"host_plugin_output",u"host_pnpgraph_present",u"host_process_performance_data",u"host_retry_interval",u"host_scheduled_downtime_depth",u"host_service_period",u"host_services",u"host_services_with_fullstate",u"host_services_with_info",u"host_services_with_state",u"host_staleness",u"host_state",u"host_state_type",u"host_statusmap_image",u"host_total_services",u"host_worst_service_hard_state",u"host_worst_service_state",u"host_x_3d",u"host_y_3d",u"host_z_3d",u"id",u"is_service",u"service_accept_passive_checks",u"service_acknowledged",u"service_acknowledgement_type",u"service_action_url",u"service_action_url_expanded",u"service_active_checks_enabled",u"service_cache_interval",u"service_cached_at",u"service_check_command",u"service_check_command_expanded",u"service_check_freshness",u"service_check_interval",u"service_check_options",u"service_check_period",u"service_check_type",u"service_checks_enabled",u"service_comments",u"service_comments_with_extra_info",u"service_comments_with_info",u"service_contact_groups",u"service_contacts",u"service_current_attempt",u"service_current_notification_number",u"service_custom_variable_names",u"service_custom_variable_values",u"service_custom_variables",u"service_description",u"service_display_name",u"service_downtimes",u"service_downtimes_with_info",u"service_event_handler",u"service_event_handler_enabled",u"service_execution_time",u"service_first_notification_delay",
u"service_flap_detection_enabled",u"service_groups",u"service_has_been_checked",u"service_high_flap_threshold",u"service_icon_image",u"service_icon_image_alt",u"service_icon_image_expanded",u"service_in_check_period",u"service_in_notification_period",u"service_in_service_period",u"service_initial_state",u"service_is_executing",u"service_is_flapping",u"service_last_check",u"service_last_hard_state",u"service_last_hard_state_change",u"service_last_notification",u"service_last_state",u"service_last_state_change",u"service_last_time_critical",u"service_last_time_ok",u"service_last_time_unknown",u"service_last_time_warning",u"service_latency",u"service_long_plugin_output",u"service_low_flap_threshold",u"service_max_check_attempts",u"service_metrics",u"service_modified_attributes",u"service_modified_attributes_list",u"service_next_check",u"service_next_notification",u"service_no_more_notifications",u"service_notes",u"service_notes_expanded",u"service_notes_url",u"service_notes_url_expanded",u"service_notification_interval",u"service_notification_period",u"service_notifications_enabled",u"service_obsess_over_service",u"service_percent_state_change",u"service_perf_data",u"service_plugin_output",u"service_pnpgraph_present",u"service_process_performance_data",u"service_retry_interval",u"service_scheduled_downtime_depth",u"service_service_period",u"service_staleness",u"service_state",u"service_state_type",u"start_time",u"triggered_by",u"type"]

....

1527163154,0,0,u"",u"",u"",u"",6.0000000000e+01,u"24x7_except_maintenance",1,0,0,0,0,0,0,0,0,0,0,1,[],0,0.0000000000e+00,u"",u"(Host check timed out after 30.10 seconds)",-1,1,1.0000000000e+00,1,u"",[],[],[],[],1.1466666667e+00,1,1,u"",0,0,0,0.0000000000e+00,0.0000000000e+00,0.0000000000e+00,177,0,0,0,0,u"",u"",0,0,0,u"",u"",0,0.0000000000e+00,0,u"",0,0,[],[],[],[],[],0,0,[],[],{},u"",u"",[],[],u"",0,0.0000000000e+00,0.0000000000e+00,0,[],0,0.0000000000e+00,u"",u"",u"",0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0000000000e+00,u"",0.0000000000e+00,0,,0,[],0,0,0,u"",u"",u"",u"",0.0000000000e+00,u"",0,0,0.0000000000e+00,u"",u"",0,0,0.0000000000e+00,0,u"",0.0000000000e+00,0,0,1516214153,0,2]] `
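
One detail in the data row above may be worth a closer look: it contains an empty field (`...0.0000000000e+00,0,,0,[],...`), and two adjacent commas are not valid inside a Python list literal. If the client evaluates the `OutputFormat: python` response as a Python expression (an assumption about how pynag parses it, not something confirmed in this thread), such a row would make parsing fail. A minimal sketch with shortened stand-in data, using made-up helper names:

```python
import ast

# Shortened stand-ins for a livestatus "python" response: a header row
# followed by one data row. The second sample has an empty field (",,"),
# like the real row quoted above.
GOOD = '[[u"id",u"type"],\n[1,2]]'
BAD = '[[u"id",u"fixed",u"type"],\n[1,,2]]'


def try_parse(response_text):
    """Return (rows, None) on success or (None, error_text) on failure."""
    try:
        return ast.literal_eval(response_text), None
    except (SyntaxError, ValueError) as exc:
        return None, "%s: %s" % (type(exc).__name__, exc)


def find_empty_fields(response_text):
    """Return offsets of adjacent commas.

    Naive scan: a ",," inside a quoted string value would also be reported.
    """
    offsets = []
    start = response_text.find(",,")
    while start != -1:
        offsets.append(start)
        start = response_text.find(",,", start + 1)
    return offsets


if __name__ == "__main__":
    for sample in (GOOD, BAD):
        rows, error = try_parse(sample)
        print(rows, error, find_empty_fields(sample))
```

Saving the raw response to a file and running `find_empty_fields` on it would show whether the empty field is the only irregularity in the output.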

tail -50 /var/log/nagios/livestatus.log
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6587 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Time to process request: 5982 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Query: GET services
2018-05-24 07:59:25 Query: Filter: state != 0
2018-05-24 07:59:25 Query: Filter: acknowledged = 0
2018-05-24 07:59:25 Query: Filter: host_acknowledged = 0
2018-05-24 07:59:25 Query: Filter: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Filter: host_scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: host_state != 0
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 37 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET services
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: acknowledged = 0
2018-05-24 07:59:25 Query: Stats: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: host_state = 0
2018-05-24 07:59:25 Query: StatsAnd: 4
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 57 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: state != 0
2018-05-24 07:59:25 Query: Stats: acknowledged = 0
2018-05-24 07:59:25 Query: Stats: scheduled_downtime_depth = 0
2018-05-24 07:59:25 Query: Stats: host_state = 1
2018-05-24 07:59:25 Query: StatsAnd: 4
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: off
2018-05-24 07:59:25 Time to process request: 56 us. Size of answer: 8 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6385 us. Size of answer: 164445 bytes
2018-05-24 07:59:25 Query: GET hosts
2018-05-24 07:59:25 Query: ResponseHeader: fixed16
2018-05-24 07:59:25 Query: OutputFormat: python
2018-05-24 07:59:25 Query: ColumnHeaders: on
2018-05-24 07:59:25 Time to process request: 6306 us. Size of answer: 164445 bytes
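
The excerpt above only shows `GET hosts` and `GET services` queries, which may simply be because the failing request fell outside the tail window. A small sketch for counting which tables appear in a larger slice of livestatus.log, assuming the `<date> <time> Query: GET <table>` line format seen above (`tables_queried` is a hypothetical helper):

```python
import re
from collections import Counter

# Matches lines such as "2018-05-24 07:59:25 Query: GET hosts".
QUERY_LINE = re.compile(r"^\S+ \S+ Query: GET (\w+)\s*$")


def tables_queried(log_text):
    """Count how often each livestatus table was queried in log_text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = QUERY_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Log path taken from the commands in this thread.
    with open("/var/log/nagios/livestatus.log") as handle:
        for table, count in sorted(tables_queried(handle.read()).items()):
            print(table, count)
```

If `downtimes` and `comments` never show up while the error is being reproduced, the request may not be reaching livestatus in the form expected.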

@Mjolinir

One thing I noticed, though I'm not sure if it is relevant:

I'm using check-mk-livestatus-1.2.8p26-1.el7 from EPEL.
If I use mk-livestatus-1.2.2-3.git.2.27fc0fd.el7.centos.x86_64 from ok-testing, then livestatus does not work at all.

@tjyang
Author

tjyang commented May 24, 2018

  • I am using the same check-mk-livestatus from EPEL that @Mjolinir is using.
[root@nagios03 ~]# rpm -qi check-mk-livestatus-1.2.8p26-1.el7.x86_64
Name        : check-mk-livestatus
Version     : 1.2.8p26
Release     : 1.el7
Architecture: x86_64
Install Date: Sat 27 Jan 2018 03:38:45 PM EST
Group       : Applications/Internet
Size        : 762663
License     : GPLv2 and GPLv3
Signature   : RSA/SHA256, Fri 06 Oct 2017 11:47:35 AM EDT, Key ID 6a2faea2352c64e5
Source RPM  : check-mk-1.2.8p26-1.el7.src.rpm
Build Date  : Fri 06 Oct 2017 11:27:00 AM EDT
Build Host  : buildhw-09.phx2.fedoraproject.org
<snipped>
[root@nagios03 ~]#

@gardart
Contributor

gardart commented May 30, 2018

check-mk-livestatus-1.2.8p26-1.el7 from EPEL is the correct one...

@tjyang
Author

tjyang commented May 30, 2018

@gardart
I am using check-mk-livestatus-1.2.8p26-1.el7 from EPEL, and Comments and Downtime still have the issue. It looks like the parser code on the adagios side needs to be adjusted.

@gardart
Contributor

gardart commented Jul 12, 2018

Does your nagios server crash when this happens? Do you need to restart the nagios service every time?

@tjyang
Author

tjyang commented Jul 12, 2018

No, neither nagios nor the livestatus daemon crashed when this issue happened.

[root@nagios03 nagios]# tail -20f  /var/log/nagios/livestatus.log
2018-07-11 16:32:19 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:19 error: Client connection terminated while request still incomplete
2018-07-11 16:32:21 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:21 error: Client connection terminated while request still incomplete
2018-07-11 16:32:41 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:41 error: Client connection terminated while request still incomplete
2018-07-11 16:32:48 Idle timeout of 12000 ms exceeded. Going to close connection.
2018-07-11 16:32:48 error: Client connection terminated while request still incomplete
2018-07-11 20:01:04 deinitializing
2018-07-11 20:01:04 Waiting for main to terminate...
2018-07-11 20:01:04 Waiting for client threads to terminate...
2018-07-11 20:01:04 Logfile cache: flushing complete cache.
2018-07-12 00:01:03 deinitializing
2018-07-12 00:01:03 Waiting for main to terminate...
2018-07-12 00:01:05 Waiting for client threads to terminate...
2018-07-12 00:01:05 Logfile cache: flushing complete cache.
2018-07-12 04:01:04 deinitializing
2018-07-12 04:01:04 Waiting for main to terminate...
2018-07-12 04:01:06 Waiting for client threads to terminate...
2018-07-12 04:01:06 Logfile cache: flushing complete cache.
^C
[root@nagios03 nagios]# date
Thu Jul 12 07:01:13 EDT 2018
[root@nagios03 nagios]#

@Mjolinir

Same applies to me: no crashes.

@Mjolinir

I noticed today that both Comments and Downtime are working! Unfortunately I am not sure which update fixed it. Here are the current versions of related packages:

check-mk-livestatus-1.4.0p31-2.el7.x86_64 (last updated June 21)
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch (last updated May 24)
adagios-1.6.3-2.git.0.4290a53.el7.noarch (last updated May 24)
nagios-4.3.4-5.el7.x86_64 (last updated Apr 16)

It seems likely that it was the check-mk-livestatus update in June and I just didn't notice, since the updates are automated with Ansible.

@tjyang can you confirm on your end?

@tjyang
Author

tjyang commented Jul 17, 2018

  • Existing rpms
[me@nagios03 ~]$  rpm -qa |egrep 'check-mk-livestatus-1|pynag-0|adagios-1|nagios-4'
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
nagios-4.3.4-3.el7.x86_64
check-mk-livestatus-1.2.8p26-1.el7.x86_64
[me@nagios03 ~]$
  • update livestatus
sudo  yum update -y check-mk-livestatus
sudo systemctl restart nagios
  • This livestatus update fixed the "Comments" and "Downtime" menus. ;)

Thanks to @Mjolinir's pointer and @gardart's help.

@tjyang tjyang closed this as completed Jul 17, 2018
@tjyang
Author

tjyang commented Jul 17, 2018

  • nagios03 above is running CentOS 7.5.
  • The following nagios01 is running CentOS 7.4 and an older version of Nagios.
    Comments and Downtime are still not working there, with the same error.
[me@nagios01 servers]$ rpm -qa |egrep 'check-mk-livestatus-1|pynag-0|adagios-1|nagios-4'
pynag-0.9.1-1.git.172.66b2afa.el7.centos.noarch
adagios-1.6.3-1.git.0.fe59eeb.el7.centos.noarch
nagios-4.1.1-2.el7.centos.x86_64
check-mk-livestatus-1.4.0p31-2.el7.x86_64
[me@nagios01 servers]$ cat /etc/redhat-release
CentOS Linux release 7.4.1708 (Core)
[me@nagios01 servers]$
  • So an older livestatus is not the only root cause. The adagios versions on nagios01 and nagios03 also differ.
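
To see at a glance which packages differ between a working and a failing host, the two `rpm -qa` listings can be compared by package name. A minimal sketch: the working list is the one @Mjolinir posted earlier in this thread, the failing list is nagios01 from above, and `split_nvr` and `diff_packages` are hypothetical helpers that assume the usual name-version-release.arch naming.

```python
# Working host: package list posted by @Mjolinir earlier in this thread.
WORKING = """
check-mk-livestatus-1.4.0p31-2.el7.x86_64
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
nagios-4.3.4-5.el7.x86_64
"""

# nagios01, where Comments and Downtime still fail.
FAILING = """
pynag-0.9.1-1.git.172.66b2afa.el7.centos.noarch
adagios-1.6.3-1.git.0.fe59eeb.el7.centos.noarch
nagios-4.1.1-2.el7.centos.x86_64
check-mk-livestatus-1.4.0p31-2.el7.x86_64
"""


def split_nvr(package):
    """Split 'name-version-release.arch' into (name, 'version-release.arch')."""
    # The name itself may contain hyphens, so split from the right.
    name, version, release = package.rsplit("-", 2)
    return name, version + "-" + release


def diff_packages(left, right):
    """Return {name: (left_version, right_version)} for packages that differ."""
    left_map = dict(split_nvr(p) for p in left.split())
    right_map = dict(split_nvr(p) for p in right.split())
    differing = {}
    for name in sorted(set(left_map) | set(right_map)):
        if left_map.get(name) != right_map.get(name):
            differing[name] = (left_map.get(name), right_map.get(name))
    return differing


if __name__ == "__main__":
    for name, versions in diff_packages(WORKING, FAILING).items():
        print(name, versions[0], versions[1])
```

With these two lists, check-mk-livestatus is identical on both hosts while pynag, adagios and nagios differ, which is consistent with livestatus not being the only factor.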

@tjyang tjyang reopened this Jul 17, 2018
@gardart
Contributor

gardart commented Oct 24, 2018

I tried two different versions of mk-livestatus, 1.2.6 and 1.2.8.
1.2.6 still works with Nagios4, but 1.2.8 gives parse errors in the downtime and comments views.
mk-livestatus works best when using Naemon as the Nagios server. You can install Adagios on top of Naemon as well.

Here is the current workaround for Nagios4:
You can build 1.2.6 with Nagios4 like this:

yum remove check-mk
wget http://www.mathias-kettner.de/download/mk-livestatus-1.2.6.tar.gz
yum install -y make gcc-c++
tar -zxvf mk-livestatus-1.2.6.tar.gz
cd mk-livestatus-1.2.6
./configure --with-nagios4
make
make install

Then use this in your broker_module settings:
broker_module=/usr/local/lib/mk-livestatus/livestatus.o /var/spool/nagios/cmd/livestatus

@gardart gardart self-assigned this Oct 26, 2018
@tjyang
Author

tjyang commented Sep 4, 2019

  • Recording another running instance on CentOS 7.6 that does not show the comment issue.
[root@nagios03 ~]# cat /etc/redhat-release; rpm -qa |egrep 'check-mk-livestatus-1|pynag-0|adagios-1|nagios-4';date
CentOS Linux release 7.6.1810 (Core)
pynag-0.9.1-1.git.187.9bcf9ed.el7.noarch
adagios-1.6.3-2.git.0.4290a53.el7.noarch
check-mk-livestatus-1.4.0p31-2.el7.x86_64
nagios-4.4.3-1.el7.x86_64
Wed Sep  4 15:17:30 EDT 2019
[root@nagios03 ~]#

@tjyang
Author

tjyang commented Oct 7, 2019

@gardart
check-mk-livestatus-1.4.0p31-2.el7.x86_64 fixed my comment/downtime display issue, but it crashes my nagios server because livestatus aborts when running the LQL 'GET hosts' command.
I tried compiling versions from 1.2.8 up to the latest 1.6; they all crashed the nagios server on GET hosts.
So I followed your tip above and used version 1.2.6, and now both 'GET hosts' and comment/downtime work. Thanks again for the pointer.
