Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix detection if the collector is running as service on Windows #9042

Merged
merged 17 commits into from
Dec 12, 2023

Conversation

pjanotti
Copy link
Contributor

@pjanotti pjanotti commented Dec 5, 2023

Description:
Fix run as a service detection on Windows. Instead of trying to detect if it is a service or not, for which both svc.IsAnInteractiveSession() and svc.IsWindowsService() are somehow broken, try to run as a Windows service, if that fails fallback to run as an interactive process. This follows a recommendation of the Windows service API documentation. The new code calls svc.Run and in case of error checks for windows.ERROR_FAILED_SERVICE_CONTROLLER_CONNECT. If this is the error the application can safely assume that it is not running as a service.

The duration of a call to svc.Run failing with this error was below 3 microseconds in the current GH runner and below 5 microseconds on my box. While this value seems fine for startup I'm keeping the NO_WINDOWS_SERVICE option instead of deprecating it (it doesn't seem worth the trouble of deprecating it).

Link to tracking Issue:
Fix #7350

Testing:
Fix tested on the Splunk fork that deploys the collector as a service and as an interactive process on Windows containers.

Documentation:
Added changelog.

@pjanotti pjanotti requested a review from a team as a code owner December 5, 2023 04:26
Copy link

codecov bot commented Dec 5, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (6d5b45f) 91.49% compared to head (6de2c63) 91.49%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #9042   +/-   ##
=======================================
  Coverage   91.49%   91.49%           
=======================================
  Files         316      316           
  Lines       17184    17184           
=======================================
  Hits        15723    15723           
  Misses       1165     1165           
  Partials      296      296           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pjanotti
Copy link
Contributor Author

pjanotti commented Dec 5, 2023

Unrelated test failure https://github.com/open-telemetry/opentelemetry-collector/actions/runs/7103768222/job/19337227044?pr=9042#step:7:159:~:text=%2D%2D%2D%20FAIL%3A%20TestZeroSizeWithConsumers%20(0.00s,FAIL

--- FAIL: TestZeroSizeWithConsumers (0.00s)
    bounded_memory_queue_test.go:244: 
        	Error Trace:	/home/runner/work/opentelemetry-collector/opentelemetry-collector/exporter/exporterhelper/internal/bounded_memory_queue_test.go:244
        	Error:      	Received unexpected error:
        	            	sending queue is full
        	Test:       	TestZeroSizeWithConsumers
FAIL

@pjanotti
Copy link
Contributor Author

pjanotti commented Dec 6, 2023

@bogdandrutu
Copy link
Member

Change (code wise) looks good, but I am not sure about the behavior. Do we have someone with windows experience to confirm this? Maybe @reyang can find someone?

@atoulme
Copy link
Contributor

atoulme commented Dec 7, 2023

Please rebase to fix the build.

@atoulme
Copy link
Contributor

atoulme commented Dec 7, 2023

I can vouch for Paulo's expertise regarding Windows. He is also maintainer of the .net SDK.

@pjanotti
Copy link
Contributor Author

@bogdandrutu bogdandrutu merged commit f8b1c65 into open-telemetry:main Dec 12, 2023
32 checks passed
@github-actions github-actions bot added this to the next release milestone Dec 12, 2023
@pjanotti pjanotti deleted the fix-run-as-service-detection branch December 12, 2023 17:33
sokoide pushed a commit to sokoide/opentelemetry-collector that referenced this pull request Dec 18, 2023
…-telemetry#9042)

**Description:**
Fix run as a service detection on Windows. Instead of trying to detect
if it is a service or not, for which both `svc.IsAnInteractiveSession()`
and `svc.IsWindowsService()` are somehow broken, try to run as a Windows
service, if that fails fallback to run as an interactive process. This
follows a recommendation of the Windows [service API
documentation](https://learn.microsoft.com/en-us/windows/win32/api/winsvc/nf-winsvc-startservicectrldispatchera#return-value).
The new code calls `svc.Run` and in case of error checks for
`windows.ERROR_FAILED_SERVICE_CONTROLLER_CONNECT`. If this is the error
the application can safely assume that it is not running as a service.

The duration of a call to `svc.Run` failing with this error was below 3
microseconds in the current GH runner and below 5 microseconds on my
box. While this value seems fine for startup I'm keeping the
`NO_WINDOWS_SERVICE` option instead of deprecating it (it doesn't seem
worth the trouble of deprecating it).

**Link to tracking Issue:**
Fix open-telemetry#7350

**Testing:**
Fix tested on the Splunk fork that deploys the collector as a service
and as an interactive process on Windows containers.

**Documentation:**
Added changelog.
codeboten pushed a commit that referenced this pull request Aug 1, 2024
I'm volunteering myself to own the tier 2 support for windows/amd64.
This will put the windows/amd64 on par with the other tier 2 platforms
by having a specific person owning it.

Some of Windows related things that I contribute(d):

* Codeowner of various Windows related components in contrib
(pkg/winperfcounters, activedirectorydsreceiver, iisreceiver,
windowseventlogreceiver, and windowsperfcountersreceiver)
* Fixing Windows related issues (e.g.: #9042, #9689, #9726,
open-telemetry/opentelemetry-collector-contrib#30743)
* MSI on the collector releases repo
(open-telemetry/opentelemetry-collector-releases#560)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Alternative method to configure NO_WINDOWS_SERVICE
4 participants