httpcheck do not accept URLs that do not end with com #3656

Closed
drdeimos opened this issue Apr 25, 2018 · 16 comments · Fixed by #3661 or #7601

@drdeimos

drdeimos commented Apr 25, 2018

Steps to reproduce:

# cat /etc/netdata/python.d/httpcheck.conf 
google:
  url: 'http://google.com'
youtube:
  url: 'http://youtube.com'

restart netdata and view log:

 /var/log/netdata :( # tail -f error.log|grep httpcheck
2018-04-26 00:15:00: python.d INFO: httpcheck: google: check() => [OK]
2018-04-26 00:15:00: python.d INFO: httpcheck: youtube: check() => [OK]

But if I try to add these domains:

google:
  url: 'http://google.com'
youtube:
  url: 'http://youtube.com'
googleru:
  url: 'http://google.ru'
mailru:
  url: 'http://mail.ru'

I get errors:

2018-04-26 00:16:52: python.d INFO: httpcheck: google: check() => [OK]
2018-04-26 00:16:52: python.d INFO: httpcheck: youtube: check() => [OK]
2018-04-26 00:16:52: python.d ERROR: httpcheck: googleru: _get_data() failed. Url: http://google.ru. Error: 'ascii' codec can't decode byte 0xf0 in position 43021: ordinal not in range(128)
2018-04-26 00:16:52: python.d INFO: httpcheck: googleru: check() => [FAILED]
2018-04-26 00:16:53: python.d ERROR: httpcheck: mailru: _get_data() failed. Url: http://mail.ru. Error: 'ascii' codec can't decode byte 0xd0 in position 173: ordinal not in range(128)
2018-04-26 00:16:53: python.d INFO: httpcheck: mailru: check() => [FAILED]

Versions:

Your netdata version: 1.10.0-51-gae5cc36_rolling
Your netdata commit: ae5cc36 
$ lsb_release -rc
Release:	16.04
Codename:	xenial
$ python -V
Python 2.7.12
$ python3 -V
Python 3.5.2
@Ferroin
Member

Ferroin commented Apr 26, 2018

I don't think this has anything to do with the TLD the URL points to.

Both google.ru and mail.ru are in Russian (as obviously implied by the .ru TLD) encoded in UTF-8. The byte indexes those error messages reference are the first times I can find in both pages where a Unicode character with a codepoint above 127 appears, and the decoder being used by httpcheck is therefore choking on the bytes above 127. I can produce similar error messages by pointing it at other sites where the main content is not in English, even if they are in a .com or .net TLD.
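The failure described above is easy to reproduce in isolation. This is a minimal Python 3 sketch; the HTML snippet is made up for illustration and is not the actual content of mail.ru or google.ru.

```python
# A made-up page fragment containing Cyrillic text, encoded as UTF-8.
page = "<title>Почта</title>".encode("utf-8")

try:
    # Decoding with the ASCII codec fails on the first byte above 127.
    page.decode("ascii")
except UnicodeDecodeError as error:
    message = str(error)
    print(message)

# Decoding with the encoding the bytes were actually written in succeeds.
decoded = page.decode("utf-8")
```

The reported position is the offset of the first non-ASCII byte, which matches the observation that the positions in the log point at the first character above codepoint 127 on each page.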

I think it's pretty likely that this could be fixed by just updating things to detect the encoding from the Content-Type header, but I defer to @ccremer on that as he's the one who wrote the plugin and therefore probably has a far better understanding of what it's doing than I do.
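A rough sketch of what detecting the encoding from the Content-Type header could look like, using only the standard library. The helper names `charset_from_content_type` and `decode_body` are hypothetical, the UTF-8 default is an assumption, and this is not necessarily how the plugin was eventually fixed.

```python
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


def decode_body(body, content_type):
    """Decode a response body using the charset the server declared."""
    charset = charset_from_content_type(content_type)
    try:
        return body.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset: fall back instead of failing the check.
        return body.decode("utf-8", errors="replace")
```

For example, `decode_body(raw, "text/html; charset=windows-1251")` decodes `raw` as windows-1251, while a header without a charset parameter falls back to UTF-8.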

@ccremer
Contributor

ccremer commented Apr 26, 2018

The problem most likely lies in the UrlService, as #3645 tries to fix (or did fix) #3641. Personally I think the bug has just moved, since I had issues with the decode() method before (that's why I didn't use decode() in the else block).

Can you find and edit UrlService.py somewhere in /usr/libexec/netdata/ and remove the decode() call? It should be somewhere around line 100, e.g.

# from...
return response.status, response.data.decode()
# to
return response.status, response.data

I currently don't have a running netdata instance at hand...

@drdeimos
Author

Changing line 103 of the file
/usr/src/netdata.git/python.d/python_modules/bases/FrameworkServices/UrlService.py

# from...
return response.status, response.data.decode()
# to
return response.status, response.data

and restarting netdata still leads to errors in the log:

# tail -f /var/log/netdata/error.log|grep httpcheck
2018-04-26 21:13:38: python.d INFO: httpcheck: google: check() => [OK]
2018-04-26 21:13:38: python.d INFO: httpcheck: youtube: check() => [OK]
2018-04-26 21:13:38: python.d ERROR: httpcheck: googleru: _get_data() failed. Url: http://google.ru. Error: 'ascii' codec can't decode byte 0xf0 in position 43028: ordinal not in range(128)
2018-04-26 21:13:38: python.d INFO: httpcheck: googleru: check() => [FAILED]
2018-04-26 21:13:39: python.d ERROR: httpcheck: mailru: _get_data() failed. Url: http://mail.ru. Error: 'ascii' codec can't decode byte 0xd0 in position 173: ordinal not in range(128)
2018-04-26 21:13:39: python.d INFO: httpcheck: mailru: check() => [FAILED]

@ccremer
Contributor

ccremer commented Apr 26, 2018

Well, that's the source file. Did you actually run the re-install script? The Python source file usually gets installed in /usr/libexec/netdata/python.d/python_modules/bases/FrameworkServices/UrlService.py and the netdata service picks up that one. So either edit the source file and run the update, or edit the /usr/libexec one directly.

Is this a production machine? It's possible that this "little change" might break other plugins that rely on the UrlService. To be on the safe side, please use a test machine.

@drdeimos
Author

Oh, sorry.
Now I have edited UrlService.py and run /usr/src/netdata.git/netdata-installer.sh afterwards.
I uncommented the .ru domains and restarted netdata:

2018-04-26 23:03:59: python.d INFO: httpcheck: google: check() => [OK]
2018-04-26 23:03:59: python.d INFO: httpcheck: youtube: check() => [OK]
2018-04-26 23:03:59: python.d INFO: httpcheck: googleru: check() => [OK]
2018-04-26 23:04:00: python.d INFO: httpcheck: mailru: check() => [OK]
2018-04-26 23:04:00: netdata INFO  : PLUGINSD[python.d] : Initializing file /var/cache/netdata/netdata.runtime_httpcheck_googleru/main.db.
2018-04-26 23:04:00: netdata INFO  : PLUGINSD[python.d] : Initializing file /var/cache/netdata/netdata.runtime_httpcheck_googleru/run_time.db.
2018-04-26 23:04:00: netdata INFO  : PLUGINSD[python.d] : Initializing file /var/cache/netdata/netdata.runtime_httpcheck_mailru/main.db.
2018-04-26 23:04:00: netdata INFO  : PLUGINSD[python.d] : Initializing file /var/cache/netdata/netdata.runtime_httpcheck_mailru/run_time.db.

Now it looks good. I see all health checks in the netdata web interface.

@ccremer
Contributor

ccremer commented Apr 26, 2018

Glad to see that :)

@l2isbad What's the reason for the decode() method? AFAIK it's only brought problems...

@ilyam8
Member

ilyam8 commented Apr 26, 2018

@ccremer in Python 3, response.data is bytes, not a string. We have to pass encoding=... or errors="ignore" (or both) to decode(). Ideally, there should be a method that determines the encoding.

https://eli.thegreenplace.net/2012/01/30/the-bytesstr-dichotomy-in-python-3
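To illustrate the difference between the decode() options mentioned above, here is a small Python 3 sketch. The byte string is constructed for the example (valid UTF-8 followed by one stray byte) and does not come from any real response.

```python
# Valid UTF-8 text followed by a stray byte that makes the whole input invalid.
raw = "Поиск в Google".encode("utf-8") + b"\xf0"

strict_failed = False
try:
    raw.decode("utf-8")  # strict mode is the default and raises on bad input
except UnicodeDecodeError:
    strict_failed = True

# errors="ignore" silently drops the bytes that cannot be decoded.
ignored = raw.decode("utf-8", errors="ignore")

# errors="replace" keeps a visible marker (U+FFFD) where decoding failed.
replaced = raw.decode("utf-8", errors="replace")
```

Both lenient modes keep a status check from failing on a single bad byte, at the cost of possibly altering the content that is later matched or logged.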

@ilyam8
Member

ilyam8 commented Apr 26, 2018

As a quick fix we can only decode if not isinstance(response.data, str)
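A minimal sketch of that quick fix, assuming Python 3. The helper name `to_text`, the UTF-8 default, and the errors="replace" policy are assumptions for illustration; the comment above only proposes the isinstance check.

```python
def to_text(data, encoding="utf-8"):
    """Return data as text, decoding only when it is not already a str."""
    if isinstance(data, str):
        return data
    # bytes (or bytearray) from the HTTP client: decode explicitly.
    return data.decode(encoding, errors="replace")
```

Note that on Python 2 `str` is the byte string type, so this check would return the raw bytes undecoded there.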

@ilyam8 ilyam8 mentioned this issue Apr 26, 2018
@ilyam8
Member

ilyam8 commented Apr 26, 2018

actually http://google.ru decode fails even with utf-8

@ktsaou
Member

ktsaou commented May 1, 2018

> actually http://google.ru decode fails even with utf-8

so, is this still buggy?

@drdeimos
Author

drdeimos commented May 1, 2018

My problem doesn't reproduce anymore after latest updates. Thx.

@a-camacho

a-camacho commented Dec 20, 2019

Hi guys,
I'm having this problem with some URLs.
Trying with http://google.ru still gives me the same error. Anyone else?

netdata@dev:/etc/netdata$ lsb_release -rc
Release:	10
Codename:	buster
netdata@dev:/etc/netdata$ python -V
Python 2.7.16
netdata@dev:/etc/netdata$ python3 -V
Python 3.7.3

And my log is:

2019-12-20 15:53:00: python.d DEBUG: httpcheck[w3] : update => [OK] (elapsed time: 23, failed retries in a row: 0)
2019-12-20 15:53:00: python.d DEBUG: httpcheck[googleru] : Url: https://www.google.ru/. Host responded with status code 200 in 0.0651650428772 s
2019-12-20 15:53:00: python.d ERROR: httpcheck[googleru] : _get_data() failed. Url: https://www.google.ru/. Error: 'ascii' codec can't decode byte 0xf0 in position 44803: ordinal not in range(128)
2019-12-20 15:53:00: python.d ERROR: httpcheck[googleru] : Traceback (most recent call last):
  File "/usr/libexec/netdata/python.d/python_modules/bases/FrameworkServices/UrlService.py", line 171, in check
    data = self._get_data()
  File "/usr/libexec/netdata/python.d/httpcheck.chart.py", line 94, in _get_data
    self.process_response(content, data, status)
  File "/usr/libexec/netdata/python.d/httpcheck.chart.py", line 116, in process_response
    self.debug('Content: \n\n{content}\n'.format(content=content))
  File "/usr/libexec/netdata/python.d/python_modules/bases/loggers.py", line 166, in debug
    'job_name': self.job_name or self.module_name})
  File "/usr/libexec/netdata/python.d/python_modules/bases/loggers.py", line 124, in debug
    self.logger.debug(' '.join(map(unicode_str, msg)), **kwargs)
  File "/usr/libexec/netdata/python.d/python_modules/bases/collection.py", line 98, in unicode_str
    return unicode(arg)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 44803: ordinal not in range(128)

2019-12-20 15:53:00: python.d INFO: plugin[main] : httpcheck[googleru] : check failed
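The traceback ends in unicode(arg), which on Python 2 decodes a byte string with the default ASCII codec and therefore fails on the first byte above 127. The sketch below shows one way a conversion helper could decode bytes explicitly instead. The name `safe_unicode_str`, the UTF-8 default, and the "replace" error policy are assumptions for illustration; this is not the change made in #7601.

```python
import sys

PY2 = sys.version_info[0] == 2
# The text type is `unicode` on Python 2 and `str` on Python 3.
text_type = unicode if PY2 else str  # noqa: F821


def safe_unicode_str(arg, encoding="utf-8"):
    """Convert arg to text without relying on the implicit ASCII codec."""
    if isinstance(arg, text_type):
        return arg
    if isinstance(arg, bytes):
        # On Python 2, `bytes` is `str`, so page content lands here.
        return arg.decode(encoding, "replace")
    return text_type(arg)
```

With a helper like this, a debug message containing non-ASCII page content is logged with replacement characters at worst, rather than failing the whole check.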

@ilyam8
Member

ilyam8 commented Dec 20, 2019

A quick fix is to add -ppython3 to the command options; see the [plugin:python.d] section in netdata.conf.

I added a fix in #7601.
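For reference, the -ppython3 quick fix would presumably look like this in netdata.conf. The section name and option come from the comment above; the exact key spelling and indentation are assumptions, so compare against the commented-out defaults in your own netdata.conf.

```ini
[plugin:python.d]
    command options = -ppython3
```

Restart netdata afterwards so the python.d plugin is relaunched with the new options.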

@a-camacho

I can confirm that your fix works for all my URLs.
The quick fix didn't work for some websites, but worked for google.ru.

@ilyam8
Member

ilyam8 commented Dec 23, 2019

@a-camacho the fix is merged

@a-camacho

Thanks!

This issue was closed.