Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed install/remove of telegraf on non-systemd Debian/Ubuntu systems #2360

Merged
merged 8 commits into from
Apr 20, 2017
Merged

Fixed install/remove of telegraf on non-systemd Debian/Ubuntu systems #2360

merged 8 commits into from
Apr 20, 2017

Conversation

martinseener
Copy link
Contributor

@martinseener martinseener commented Feb 2, 2017

When having systemd installed but not actively used (so systemd is not PID 1 eg. not running) we have to assume that we run sysv instead. So i’ve changed the check wether systemd is installed (which systemctl ex. on line 63 post-install.sh) to wether systemd is actually running (pidof systemd). If it’s not running we can safely fallback to the sysv method for install/remove because pidof systemd will exit 1 of systemd is not running. This also fixes problems when running Debian Jessie with package sysvinit-core installed and used instead of systemd!

I’ve also optimzed the bash scripts a bit, so for example the return codes are not checked after a command has been run (line 28 post-install.sh) but the actual return code is actively checked (if <command>; then instead of <command>; if [ $? -eq 0 ]; then - for rationale see: https://github.com/koalaman/shellcheck/wiki/SC2181)

Also changed the remove/purge OR (line 30 post-remove.sh) see: https://github.com/koalaman/shellcheck/wiki/SC2166 and made some minor indentation fixes

Required for all PRs:

  • CHANGELOG.md updated (we recommend not updating this until the PR has been approved by a maintainer)
  • Sign CLA (if not already signed)
  • README.md updated (if adding a new plugin)

@martinseener
Copy link
Contributor Author

martinseener commented Feb 2, 2017

@sparrc don't know what exactly fails in circle but an earlier build succeeded in my fork (only changed the commit message in between). can we do a quick release of telegraf 1.2.2 with this fix because we encountered this fixed issue with systemd while trying to deploy telegraf on some newer machines and we want to continue - but we need a new release with that fix!? that would be awesome!

haven't updated the CHANGELOG in hope of an earlier release ;)

@sparrc
Copy link
Contributor

sparrc commented Feb 2, 2017

sorry but I can't do one-off releases on-demand, you'll have to build it yourself.

I think this change seems OK but you'll need to create corresponding PRs in the influxdb and kapacitor projects before I can merge it.

cc @rossmcdonald

@martinseener
Copy link
Contributor Author

martinseener commented Feb 2, 2017

okay no problem. will do the corresponding PR's in the other two projects.
is it fine to add a changelog entry for 1.3.0 in this case now? (and for the other projects as well?)

Rebased right now and made some more enhancements to the pre-install and post-remove scripts w. code copied from influxdb which is better

For more clarification on the Debian Jessie w. SysVinit:

root@machine:# which systemctl
/bin/systemctl
root@machine:# pidof systemd
root@machine:# echo $?
1
root@machine:# pidof init
1
root@machine:# echo $?
0
root@machine:# cat /etc/debian_version
8.6

@martinseener
Copy link
Contributor Author

@sparrc @rossmcdonald i've made some more enhancements after comparing the scripts from telegraf, influxdb and kapacitor and enhanced all scripts of all three projects accordingly. i've referenced this PR in the other two PR too, so more information is available about the what and why. I hope it's fine for you.

As soon as you tell me that i can add the PR as a link to the changelog (for 1.3.0?), ill do another rebase on all three projects accordingly. If there are any more questions, just ask!

Btw: i've tested those changes manually locally as well on a Debian Jessie system with sysvinit-core installed and running (and systemd installed but not running). See last comment.

PR on InfluxDB: influxdata/influxdb#7933
PR on Kapacitor: influxdata/kapacitor#1161

which systemctl &>/dev/null
if [[ $? -eq 0 ]]; then
install_systemd
if pidof systemd &>/dev/null; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably check for pid 1 instead. systemd could be running on the system, but not be the init manager. Such as if it's running within a container.
A safe test would be if [[ "$(readlink /proc/1/exe)" == */systemd ]]; then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried that. Indeed it's better and i didn't knew before that systemd can also run but not be the init system. will change the code accordingly ;)

fi
elif [ "$1" == "remove" -o "$1" == "purge" ]; then
elif [[ -f /etc/lsb-release ]]; then
# Debian/Ubuntu logic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come in post-install.sh, the debian/ubuntu logic is controlled by a test for /etc/debian_version, but here it's a test for /etc/lsb-release?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good Point. I've mostly copied code from influxdb/kapacitor scripts yesterday. I've fixed this now in the next rebased push and will also fix all that stuff in the other PR's of InfluxDB/Kapacitor

grep "^telegraf:" /etc/group &>/dev/null
if [[ $? -ne 0 ]]; then
if ! id telegraf &>/dev/null; then
if grep "^telegraf:" /etc/group &>/dev/null; then
Copy link
Contributor

@phemmer phemmer Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reverses the logic from the previous code. Previous code evaluated as true if the match WASN'T found. This code evaluates as true if the match WAS found.

Copy link
Contributor Author

@martinseener martinseener Feb 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh no....the code is right!! See, before: you execute "id telegraf" and if the user exists already, the return code is 0. The if statement will ONLY be executed if the exit code is NOT 0 eg. the user DOES NOT exist. For my code part it reads like this. If there is a user, negate the return code to non-zero and therefore the if statement is NOT executed because it only has to be executed if there is NO user named telegraf. So "id telegraf" returns NON-0 if there is no such user, the ! negates it to TRUE and the if block is executed, because there is no telegraf user. It's a bit weird but the code is correct ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phemmer was probably talking about the grep telegraf from /etc/group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhh sure....Yeah there's a ! missing in front of the grep like one line above. will fix this immediately!

@sparrc
Copy link
Contributor

sparrc commented Feb 3, 2017

Does anyone have the resources to test this? I'm a bit hesitant to merge this because init/install changes always seem to cause problems. I haven't heard of any users having problems with the current scripts until this PR, which makes me think this change is only for supporting an uncommon setup.

@martinseener why don't you just use systemd?

@martinseener
Copy link
Contributor Author

@sparrc Can you tell me how i can build the deb out of this PR so i'll happily test this with a final .deb package. But i've already tested this by modifying the scripts from the recent package and it works.

Well "uncommon"....maybe. Systemd is indeed the "default" in Debian Jessie and onwards BUT it's not that uncommon to continue using SysVinit with Jessie like an official debian wiki document points out (https://wiki.debian.org/systemd#Installing_without_systemd) There you can see how to use SysVinit instead of Systemd right from the installation. We use this throughout our Jessie Systems as we have several problems with Systemd. (One for ex. booting with PXE is not working here and we have tools that doesnt work with Systemd yet - eg. legacy tools used in companys) These are valid points why people still need to use SysVinit.

The other point is that the scripts used at the moment are easily not well written - it's basically "bad" code as the checks used for "systemd"-usage with which systemctl are bad like i pointed out in my first commit description. I just patch that with good code - or let's say - better code for correctly checking if systemd or sysvinit is used.

And people do have problems like you can see here on InfluxDB (Problem is old kernel using sysvinit on a recent ubuntu): influxdata/influxdb#7899

And the last point: This code actually doesn't break anything (i'll happily test this if you tell me how to build the deb from this PR) but it makes the code less error-prone on non-uniform systems. It basically adds support for non-uniform systems ;)

@sparrc
Copy link
Contributor

sparrc commented Feb 3, 2017

I just patch that with good code - or let's say - better code for correctly checking if systemd or sysvinit is used.

could you show me other systems that use the same checks in their install scripts? I'm assuming we got which systemctl from somewhere, so I'd like to see if there are examples or documentation that support the code that you're proposing.

And the last point: This code actually doesn't break anything

famous last words ;-)

@martinseener
Copy link
Contributor Author

martinseener commented Feb 3, 2017

@sparrc Ok, for example. This is actually a Debian Jessie System with SysVinit installed like described in the debian.org wiki article (it's actually one of our staging systems in our company). When trying to remove telegraf from the latest official release it looks like this (because it can find systemctl but systemd is not running, it fails to remove!):

root@workaholic1-21-101:~# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  telegraf
0 upgraded, 0 newly installed, 1 to remove and 95 not upgraded.
After this operation, 32.6 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 48687 files and directories currently installed.)
Removing telegraf (1.2.0-1) ...
Failed to get D-Bus connection: Unknown error -1
dpkg: error processing package telegraf (--remove):
  subprocess installed pre-removal script returned error exit status 1
Synchronizing state for telegraf.service with sysvinit using update-rc.d...
Executing /usr/sbin/update-rc.d telegraf defaults
Executing /usr/sbin/update-rc.d telegraf enable
Failed to get D-Bus connection: Unknown error -1
Failed to get D-Bus connection: Unknown error -1
WARNING: systemd not running.
Errors were encountered while processing:
 telegraf
E: Sub-process /usr/bin/dpkg returned an error code (1)

The actual script which is run here can be found in /var/lib/dpkg/info/telegraf.prerm and is actually scripts/pre-remove.sh. Modifying this with my code and also modifying the /var/lib/dpkg/info/telegraf.postrm which is actually scripts/post-remove.sh also with the new code results in this output on the same machine:

root@workaholic1-21-101:~# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  telegraf
0 upgraded, 0 newly installed, 1 to remove and 95 not upgraded.
After this operation, 32.6 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 48687 files and directories currently installed.)
Removing telegraf (1.2.0-1) ...
Stopping telegraf..
Stopped
root@workaholic1-21-101:~# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
Package 'telegraf' is not installed, so not removed
0 upgraded, 0 newly installed, 0 to remove and 95 not upgraded.

You see, my script makes it possible to successfully remove telegraf on non-uniform debians again.
Is this enough evidence or do i have to test more for you? ;)

If you tell me how to build the deb with my scripts ill also test the installation and upgrade procedure for you but iam pretty sure it works too!

@sparrc
Copy link
Contributor

sparrc commented Feb 3, 2017

@martinseener I meant more that I'd like someone who didn't write the PR to test it.

I'm sure that it works for you, but we get bit with "works on my machine" bugs all the time, especially related to install/setup.

@martinseener
Copy link
Contributor Author

martinseener commented Feb 3, 2017

@sparrc So please tell me how i can build a new DEB Package out of this PR and i'll test it on a proper Jessie w. Systemd, without Systemd, Ubuntu with and without for you as well.

I don't know who could possibly test my PR besides you and me. What do we do if we don't get a volunteer on this?

BTW: The Circle-CI volunteer says, it works but i don't know if it tests those scripts ;)

I'm trying to modify the latest deb by myself now - should be easier than building a new one. Then ill test this deb on some more machines and report back.

@martinseener
Copy link
Contributor Author

martinseener commented Feb 3, 2017

@sparrc So i've built a debian package out of my branch with the changed scripts and called it telegraf-1.3.0 for this moment. I've attached it here, so everyone can test it on their own. Then i have tested it on Debian Jessie 8.8 with SysVinit without any errors. Install/Remove and Remove with Purge ran just fine. I've also tested Debian Jessie with systemd and Ubuntu 16.04.1 LTS with Systemd. I hope these are just enough tested systems. Tell me if you need more systems - and maybe which one. Below are the results from those 3 systems. Hope those can convince you that my changes are good and remove hassles from installing telegraf of non-conform systems like ours ;-)

Results from our staging machine (Jessie/SysVinit) first (check OS and init system + install telegraf + remove telegraf + install again + remove with purge + try start + check processes):

root@workaholic1-21-101:~# readlink /proc/1/exe
/sbin/init
root@workaholic1-21-101:~# cat /etc/debian_version
8.6
root@workaholic1-21-101:~# dpkg -i /home/martinseener/telegraf_1.3.0~0c3c921_amd64.deb
Selecting previously unselected package telegraf.
(Reading database ... 48677 files and directories currently installed.)
Preparing to unpack .../telegraf_1.3.0~0c3c921_amd64.deb ...
Unpacking telegraf (1.3.0~0c3c921-0) ...
Setting up telegraf (1.3.0~0c3c921-0) ...
telegraf process is not running [ FAILED ]
Starting the process telegraf [ OK ]
telegraf process was started [ OK ]
root@workaholic1-21-101:~# service telegraf status
telegraf Process is running [ OK ]
root@workaholic1-21-101:~# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  telegraf
0 upgraded, 0 newly installed, 1 to remove and 95 not upgraded.
After this operation, 35.5 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 48687 files and directories currently installed.)
Removing telegraf (1.3.0~0c3c921-0) ...
telegraf process was stopped [ OK ]
root@workaholic1-21-101:~# dpkg -i /home/martinseener/telegraf_1.3.0~0c3c921_amd64.deb
Selecting previously unselected package telegraf.
(Reading database ... 48677 files and directories currently installed.)
Preparing to unpack .../telegraf_1.3.0~0c3c921_amd64.deb ...
Unpacking telegraf (1.3.0~0c3c921-0) ...
Setting up telegraf (1.3.0~0c3c921-0) ...
telegraf process is not running [ FAILED ]
Starting the process telegraf [ OK ]
telegraf process was started [ OK ]
root@workaholic1-21-101:~# apt-get remove --purge telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
  telegraf*
0 upgraded, 0 newly installed, 1 to remove and 95 not upgraded.
After this operation, 35.5 MB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 48687 files and directories currently installed.)
Removing telegraf (1.3.0~0c3c921-0) ...
telegraf process was stopped [ OK ]
Purging configuration files for telegraf (1.3.0~0c3c921-0) ...
dpkg: warning: while removing telegraf, directory '/var/log/telegraf' not empty so not removed
dpkg: warning: while removing telegraf, directory '/etc/telegraf' not empty so not removed
root@workaholic1-21-101:~# /etc/init.d/telegraf start
-bash: /etc/init.d/telegraf: No such file or directory
root@workaholic1-21-101:~# ps aux | grep telegraf
root     18690  0.0  0.0  11128  1036 pts/1    R+   15:10   0:00 grep --color=auto telegraf

That's it for our Jessie with SysVinit. Now Ubuntu 16.04.1 LTS with systemd:

root@ubuntu:~# readlink /proc/1/exe
/lib/systemd/systemd
root@ubuntu:~# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=16.04
DISTRIB_CODENAME=xenial
DISTRIB_DESCRIPTION="Ubuntu 16.04.1 LTS"
root@ubuntu:~# dpkg -i /home/martin/telegraf_1.3.0~0c3c921_amd64.deb
Selecting previously unselected package telegraf.
(Reading database ... 59643 files and directories currently installed.)
Preparing to unpack .../telegraf_1.3.0~0c3c921_amd64.deb ...
Unpacking telegraf (1.3.0~0c3c921-0) ...
Setting up telegraf (1.3.0~0c3c921-0) ...
Created symlink from /etc/systemd/system/multi-user.target.wants/telegraf.service to /lib/systemd/system/telegraf.service.
root@ubuntu:~# service telegraf status # For shorter output i've redacted the log lines)
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
	Loaded: loaded (/lib/systemd/system/telegraf.service; enabled; vendor preset: enabled)
	Active: active (running) since Fri 2017-02-03 15:14:37 CET; 16s ago
 	Docs: https://github.com/influxdata/telegraf
	Main PID: 2662 (telegraf)
	Tasks: 8
	Memory: 8.1M
  	CPU: 110ms
	CGroup: /system.slice/telegraf.service
       		└─2662 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
root@ubuntu:~# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
	telegraf*
0 upgraded, 0 newly installed, 1 to remove and 126 not upgraded.
After this operation, 35.5 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 59654 files and directories currently installed.)
Removing telegraf (1.3.0~0c3c921-0) ...
Removed symlink /etc/systemd/system/multi-user.target.wants/telegraf.service.
root@ubuntu:~# systemctl start telegraf
Failed to start telegraf.service: Unit telegraf.service not found.
root@ubuntu:~# ps aux | grep telegraf
root      4021  0.0  0.0  15744   932 pts/0    S+   15:26   0:00 grep --color=auto telegraf

Also Debian Jessie WITH Systemd (vanilla 8.7 netboot install):

root@debian:/home/martin# cat /etc/debian_version
8.7
root@debian:/home/martin# readlink /proc/1/exe
/lib/systemd/systemd
root@debian:/home/martin# dpkg -i telegraf_1.3.0~0c3c921_amd64.deb
Selecting previously unselected package telegraf.
(Reading database ... 30760 files and directories currently installed.)
Preparing to unpack telegraf_1.3.0~0c3c921_amd64.deb ...
Unpacking telegraf (1.3.0~0c3c921-0) ...
Setting up telegraf (1.3.0~0c3c921-0) ...
Created symlink from /etc/systemd/system/multi-user.target.wants/telegraf.service to /lib/systemd/system/telegraf.service.
root@debian:/home/martin# systemctl status telegraf
	● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
	Loaded: loaded (/lib/systemd/system/telegraf.service; enabled)
	Active: active (running) since Fri 2017-02-03 15:38:52 CET; 13s ago
 	Docs: https://github.com/influxdata/telegraf
	Main PID: 849 (telegraf)
	CGroup: /system.slice/telegraf.service
       		└─849 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
root@debian:/home/martin# apt-get remove telegraf
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages will be REMOVED:
	telegraf
0 upgraded, 0 newly installed, 1 to remove and 0 not upgraded.
After this operation, 35.5 MB disk space will be freed.
Do you want to continue? [Y/n]
(Reading database ... 30771 files and directories currently installed.)
Removing telegraf (1.3.0~0c3c921-0) ...
Removed symlink /etc/systemd/system/multi-user.target.wants/telegraf.service.
root@debian:/home/martin# systemctl start telegraf
Failed to start telegraf.service: Unit telegraf.service failed to load: No such file or directory.
root@debian:/home/martin# ps aux | grep telegraf
root       969  0.0  0.2  12700  2116 pts/0    S+   15:40   0:00 grep telegraf

telegraf_1.3.0~0c3c921_amd64.deb.zip

When having systemd installed but not actively used (so systemd is not PID 1 eg. not running) we have to assume that we run sysv instead. So i’ve changed the check wether systemd is installed (`which systemctl` ex. on line 63 post-install.sh) to wether systemd is actually running (`pidof systemd`). If it’s not running we can safely fallback to the sysv method for install/remove because `pidof systemd` will exit 1 of systemd is not running. This also fixes problems when running Debian Jessie with package `sysvinit-core` installed and used instead of systemd!

I’ve also optimzed the bash scripts a bit, so for example the return codes are not checked after a command has been run (line 28 post-install.sh) but the actual return code is actively checked (`if <command>; then` instead of `<command>; if [ $? -eq 0 ]; then` - for rationale see: https://github.com/koalaman/shellcheck/wiki/SC2181)

Also changed the remove/purge OR (line 30 post-remove.sh) see: https://github.com/koalaman/shellcheck/wiki/SC2166 and made some minor indentation fixes
@martinseener
Copy link
Contributor Author

martinseener commented Feb 3, 2017

There was actually indeed another bug which a colleague with CentOS stumbled upon while installing my branch as an rpm. the install_init part was missing for non-update-rc.d systems, so i moved that up in the code that it works for update-rc.d and chkconfig systems - eg. redhat/centos or all non-debian/ubuntu. Also this colleague peer reviewed it internally now (who build the rpm) and haven't found any more issues so it should be safe to merge at this point as it works on debian/ubuntu just fine as well as CentOS. If you wish i can also ask my colleague to send over the tests from his CentOS machine. For now i attached the new telegraf debian package which has now been tagged 1.2.1.1 so we can install it safely and update later to the next official release of telegraf which hopefully includes this fix 👍
telegraf_1.2.1.1~0c3c921_amd64.deb.zip

@martinseener
Copy link
Contributor Author

martinseener commented Feb 3, 2017

@sparrc See last comment please. And here is also the test from CentOS 7 with systemd and the rpm build from this branch (rpm attached). This also works as expected ;) I really hope this is enough evidence that 1) my code works on several different systems 2) fixes the systemd/sysvinit handling as explained in my first commit 👍

[root@localhost telegraf]# rpm -i build/telegraf-1.2.1.1~694955c.x86_64.rpm
Created symlink from /etc/systemd/system/multi-user.target.wants/telegraf.service to /usr/lib/systemd/system/telegraf.service.
[root@localhost telegraf]# yum remove telegraf
Failed to set locale, defaulting to C
Geladene Plugins: fastestmirror
Abh?ngigkeiten werden aufgel?st
--> Transaktionspr?fung wird ausgef?hrt
---> Paket telegraf.x86_64 0:1.2.1.1~694955c-0 markiert, um gel?scht zu werden
--> Abh?ngigkeitsaufl?sung beendet

Abh?ngigkeiten aufgel?st

=============================================================================================================================================================================
 Package                                Arch                                 Version                                            Paketquelle                             Gr??e
==============================================================================================================================================================================
Entfernen:
 telegraf                               x86_64                               1.2.1.1~694955c-0                                  installed                                34 M

Transaktions?bersicht
==============================================================================================================================================================================
Entfernen  1 Paket

Installationsgr??e: 34 M
Ist dies in Ordnung? [j/N] :j
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warnung: RPMDB wurde außerhalb von yum verändert.
  L?schen          : telegraf-1.2.1.1~694955c-0.x86_64                                                                                                                    1/1
Removed symlink /etc/systemd/system/multi-user.target.wants/telegraf.service.
Überprüfung läuft: telegraf-1.2.1.1~694955c-0.x86_64                                                                                                                    1/1

Entfernt:
  telegraf.x86_64 0:1.2.1.1~694955c-0

Komplett!
[root@localhost telegraf]# systemctl status telegraf
Unit telegraf.service could not be found.

telegraf-1.2.1.1~694955c.x86_64.rpm.zip

And here is someone else who encountered this issue (with influxdb): influxdata/influxdb#7899
I've also fixed it there, as soon as it's merged (same for kapacitor)

martinseener added a commit to martinseener/influxdb that referenced this pull request Feb 5, 2017
@sparrc sparrc added this to the 1.3.0 milestone Feb 14, 2017
install_init
# Run update-rc.d or fallback to chkconfig if not available
if which update-rc.d &>/dev/null; then
install_update_rcd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we try this? I expect redhat would never have these scripts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't know on which Debian-Derivate the .deb package is being executed on later and so you don't know if it uses update-rc.d or chkconfig in advance. this is why we need both (so this .deb package can be run on as most distros as possible!)

if which update-rc.d &>/dev/null; then
install_update_rcd
else
install_chkconfig
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would we try this on debian? It looks like debian does have chkconfig, but the last version it was packaged for was jessie.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above. You shouldn't depend on users only using Debian or Ubuntu. There are a ton of derivates which can use either update-rc.d or chkconfig, so this script should be as flexible as possible. what's you real problem here? i made it really flexible and tested it on several distributions (debian and non-debian) that it works. and it does perfectly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so if we want to add support for chkconfig here we should also support it in the post-remove script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're totally right, it's missing there. I'll rework the script and add support for it. Should be an easy quick fix.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@danielnelson chkconfig support for debian based systems in post-remove.sh has now been added

# Debian/Ubuntu logic
# Remove/purge
rm -f /etc/default/telegraf
if [[ "$1" != "upgrade" ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should keep the same logic as before (remove or purge), because there are other ways this script can be called such as failed-upgrade, or abort-install.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it back to the old method as i totally agree with you!

because there can be other options like abort-install
@danielnelson danielnelson merged commit 748ca7d into influxdata:master Apr 20, 2017
nathanielc pushed a commit to influxdata/kapacitor that referenced this pull request Nov 8, 2017
Also added distribution-specific install/remove logic with code copied from telegraf/influxdb scripts. For more rationale, see: influxdata/telegraf#2360
fdhex pushed a commit to fdhex/kapacitor that referenced this pull request Jun 12, 2018
Also added distribution-specific install/remove logic with code copied from telegraf/influxdb scripts. For more rationale, see: influxdata/telegraf#2360
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants