Windows: fix service termination #18916

Merged (3 commits) on Jun 5, 2020

adriansr
Contributor

@adriansr adriansr commented Jun 2, 2020

What does this PR do?

Update the Windows service handling logic so that the service doesn't transition to the STOPPED state until the beater has terminated. Currently it transitions as soon as it receives the stop signal, so on a restart a new Beat process is started while the previous one is still terminating.

Why is it important?

Since #14069 was merged, Beats randomly fail to restart on Windows when run as a service. The failure isn't caused by that PR itself, but by a long-standing issue with how the service state is handled.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Testing

To reproduce the bug this fixes, you just have to restart a Beats service.

PS> restart-service winlogbeat

It will fail because the already-running service transitions to STOPPED while it is still terminating. A new service process is then started while the data dir is still locked by the terminating Beat.

This is easy to reproduce with Winlogbeat using the default config. It may be harder with other Beats, as it depends on how long the running service takes to terminate.

Related issues

Fixes #18914

Update the Windows service handling logic so that the service doesn't
transition to the STOPPED state until the beater is terminated.

Fixes elastic#18914
@elasticmachine
Collaborator

Pinging @elastic/siem (Team:SIEM)

@botelastic botelastic bot added and then removed the needs_team label (Indicates that the issue/PR needs a Team:* label) Jun 2, 2020
@elasticmachine
Collaborator

elasticmachine commented Jun 2, 2020

💚 Build Succeeded

Build stats

  • Build Cause: [Started by user Adrian Serrano, Replayed #5]

  • Start Time: 2020-06-04T20:22:25.637+0000

  • Duration: 78 min 23 sec

Test stats 🧪

Test Results
Failed 0
Passed 9305
Skipped 1574
Total 10879

@adriansr adriansr requested a review from andrewkroh June 3, 2020 17:53
Member

@andrewkroh andrewkroh left a comment

LGTM. I wasn't aware of that path.data locking feature.

@andrewkroh
Member

jenkins, run tests

@adriansr adriansr merged commit f3ab7c7 into elastic:master Jun 5, 2020
adriansr added a commit to adriansr/beats that referenced this pull request Jun 5, 2020
Update the Windows service handling logic so that the service doesn't
transition to the STOPPED state until the beater is terminated.

Before this patch, a Beats service would report STOPPED as soon
as it received the stop request. This caused problems during service
restarts, because the new service would start while the old one was still
cleaning up.

Fixes elastic#18914

(cherry picked from commit f3ab7c7)
@adriansr adriansr added the v7.9.0 label Jun 5, 2020
adriansr added a commit to adriansr/beats that referenced this pull request Jun 5, 2020
@adriansr adriansr added the v7.8.0 label Jun 5, 2020
adriansr added a commit to adriansr/beats that referenced this pull request Jun 5, 2020
@adriansr adriansr added the v7.7.2 label Jun 5, 2020
adriansr added a commit that referenced this pull request Jun 8, 2020
adriansr added a commit that referenced this pull request Jun 8, 2020
adriansr added a commit that referenced this pull request Jun 8, 2020
melchiormoulin pushed a commit to melchiormoulin/beats that referenced this pull request Oct 14, 2020
leweafan pushed a commit to leweafan/beats that referenced this pull request Apr 28, 2023

(cherry picked from commit 78e355b)
Development

Successfully merging this pull request may close these issues.

Beats running as a windows service randomly fail to restart
3 participants