Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes #28126 - improve dynflow sidekiq service #7127

Merged

Conversation

ezr-ondrej
Copy link
Member

Improves the sidekiq service according to the comments from theforeman/foreman-packaging#4194 and template from https://github.com/mperham/sidekiq/blob/master/examples/systemd/sidekiq.service would love to improve logging, but there my knowledge of our guidlines isn't sufficient.

@theforeman-bot
Copy link
Member

Issues: #28126

WorkingDirectory=/usr/share/foreman
ExecStart=/usr/bin/sidekiq -e ${RAILS_ENV} -r ${DYNFLOW_SIDEKIQ_SCRIPT} -C /etc/foreman/dynflow/%i.yml
ExecReload=/usr/bin/kill -TSTP $MAINPID

SyslogIdentifier=dynflow-sidekiq@%i
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice! I looked but never found how to solve that.

ekohl
ekohl previously approved these changes Oct 24, 2019
extras/systemd/dynflow-sidekiq@.service Show resolved Hide resolved
User=foreman
TimeoutSec=90
Environment=RAILS_ENV=production
Environment=DYNFLOW_SIDEKIQ_SCRIPT=/usr/share/foreman/extras/dynflow-sidekiq.rb
# Greatly reduce Ruby memory fragmentation and heap usage
# https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/
Environment=MALLOC_ARENA_MAX=2
WorkingDirectory=/usr/share/foreman
ExecStart=/usr/bin/sidekiq -e ${RAILS_ENV} -r ${DYNFLOW_SIDEKIQ_SCRIPT} -C /etc/foreman/dynflow/%i.yml
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one downside to this pattern is that if I give it a name that does not exist, it silently accepts my systemctl command leaving the user with no indication they did something wrong. Here is my example:

I accidentally misspell orchestrator trying to start or restart it:

[root@centos7-katello-nightly vagrant]# systemctl start dynflow-sidekiq@orchestrat
[root@centos7-katello-nightly vagrant]#

The output is silent leaving me with no clear indicator I tried to start something that does not exist. The journal does log it:

Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: No such file /etc/foreman/dynflow/orchestrat.yml
Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: /opt/theforeman/tfm/root/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/cli.rb:239:in `setup_options'
Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: /opt/theforeman/tfm/root/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/cli.rb:31:in `parse'
Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: /opt/theforeman/tfm/root/usr/share/gems/gems/sidekiq-5.2.7/bin/sidekiq:11:in `<top (required)>'
Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: /opt/theforeman/tfm/root/usr/bin/sidekiq:23:in `load'
Oct 24 16:27:54 centos7-katello-nightly.war.example.com dynflow-sidekiq@orchestrat[28392]: /opt/theforeman/tfm/root/usr/bin/sidekiq:23:in `<main>'

However, I am more than certain we will get bugs logged with this pattern. I think we should re-think this pattern or add some sort of checking that produces clear output to a user when managing these services that they tried to perform an operation on an unknown configuration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted, don't know what to do about it, though 🤔 I will try to figure something out. But would like to get this in 1.24 if possible and I believe this is improvement even without it.

@ehelms
Copy link
Member

ehelms commented Oct 24, 2019

If I accidentally start just the orchestrator and forget the worker, and try an action I get:

An error occurred while saving the Product: 0 The time waiting for task e72cbfe8-2cc6-482f-81b5-12105571403c to finish exceeded the 'foreman_tasks_sync_task_timeout' (120s)

This error isn't very clear to me what I should do next as a user. And now I have a task in the tasks page stuck at planned pending with no resume or cancel option.

Copy link
Member

@tbrisker tbrisker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ezr-ondrej, @ehelms and @ekohl !

@tbrisker tbrisker merged commit 0708f7c into theforeman:develop Oct 28, 2019
User=foreman
TimeoutSec=90
Environment=RAILS_ENV=production
Environment=DYNFLOW_SIDEKIQ_SCRIPT=/usr/share/foreman/extras/dynflow-sidekiq.rb
# Greatly reduce Ruby memory fragmentation and heap usage
# https://www.mikeperham.com/2018/04/25/taming-rails-memory-bloat/
Environment=MALLOC_ARENA_MAX=2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a cost of reducing CPU performance, it's worth noting. Probably not a big deal for dynlflow but it's always good to see some numbers when doing fine tuning like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the regular dynflow this has been tested extensively by the Satellite performance team and they found it leads to a much more stable installation.

SyslogIdentifier=dynflow-sidekiq@%i

# if we crash, restart
RestartSec=1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question tho - how much time the sidekick/dynflow needs boot up? I think one second is probably too tight, if that's less then once second, this can quickly go out of control.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the time to sleep before restarting. If not specified, it'll instantly restart. Also note that Type=simple means it runs in the foreground. IIRC systemd gives up after a few attempts in quick succession to avoid restart loops.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants