Possible race condition with creation of throttle sem (needs to exist before CF initializes) #184
Comments
Calling this a bug because it's a race condition. Simply starting CF after the I/O app seems to avoid hitting it most of the time, but it's lurking there nonetheless.
I'm running into this issue with the integration of BP + CF. BP creates the throttle sem, and although CF is started after BP, the sem still hasn't been created by the time CF checks for it, so CF fails out. I can hack around the race for now, but we probably need an actual solution to this.
This old issue is becoming a real pain point for BP startup: it hits this race quite often and CF bails out. I'm going to submit a PR with a possible workaround.
Just wait until the sem is created? Loop w/ a 1 Hz sleep?
Yes, pretty much. I had been locally using a patched/hacked version of CF with a sleep, but now I'm making some GitHub workflows and it would be much cleaner if there was something in the main line to deal with the sem creation.
Adds a retry loop around OS_CountSemGetIdByName, because if this sem is created by another app there may be some delay until the other app gets to the point where it creates the sem. This works around the race condition. A retry limit is also imposed so CF will not spin here forever.
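A rough sketch of what such a bounded retry could look like, not the code from the PR itself. The OSAL calls are replaced with local stand-ins so the fragment compiles on its own; the stub behavior, the error-code value, `BindThrottleSem`, and the retry limit and delay are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-ins for OSAL types and calls so this compiles alone.
 * Real code would include the OSAL headers instead. */
typedef uint32_t osal_id_t;
#define OS_SUCCESS            0
#define OS_ERR_NAME_NOT_FOUND (-17) /* placeholder value for this sketch */

static int stub_calls_until_created = 3; /* lookups that fail before the sem "appears" */
static int stub_delay_calls         = 0; /* counts how often the loop slept */

static int32_t OS_CountSemGetIdByName(osal_id_t *sem_id, const char *sem_name)
{
    (void)sem_name;
    if (stub_calls_until_created > 0)
    {
        --stub_calls_until_created;
        return OS_ERR_NAME_NOT_FOUND; /* owning app has not created it yet */
    }
    *sem_id = 42;
    return OS_SUCCESS;
}

static int32_t OS_TaskDelay(uint32_t millisecond)
{
    (void)millisecond; /* the stub does not actually sleep */
    ++stub_delay_calls;
    return OS_SUCCESS;
}

/* Assumed tuning values, not taken from the CF source. */
#define SEM_RETRY_LIMIT    25
#define SEM_RETRY_DELAY_MS 100

/* Hypothetical helper: look the semaphore up by name, retrying while it does
 * not exist yet, and give up after SEM_RETRY_LIMIT attempts. */
int32_t BindThrottleSem(const char *sem_name, osal_id_t *sem_id)
{
    int32_t status = OS_ERR_NAME_NOT_FOUND;
    int     attempt;

    for (attempt = 0; attempt < SEM_RETRY_LIMIT; ++attempt)
    {
        status = OS_CountSemGetIdByName(sem_id, sem_name);
        if (status != OS_ERR_NAME_NOT_FOUND)
        {
            break; /* success, or an error that retrying will not fix */
        }
        OS_TaskDelay(SEM_RETRY_DELAY_MS); /* give the owning app time to create it */
    }

    return status;
}
```

Retrying only on "name not found" keeps other failures (for example a bad name) from being masked by the loop, and the limit bounds the worst-case startup delay to roughly the retry count times the delay.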
Fix #184, work around throttle sem creation race
Describe the bug
CFE ES starts all applications in their own thread. Therefore, conceptually at least, all apps are starting at the same time.
If configured to use a throttle sem, the CF app expects that semaphore to be created before it starts. During startup, it will attempt to bind to that semaphore during CF_CFDP_InitEngine(), and if that fails, CF aborts (see #178).
Problem is, if the semaphore is created by another app, whether it be CI/TO or some other dedicated I/O app, there is no guarantee that the semaphore has been created before CF attempts to use it.
A secondary problem exists if the I/O app that owns the sem gets restarted or reloaded: the semaphore ID will likely change too. This may be recoverable by disabling the engine and re-enabling it (but I haven't tested that).
To Reproduce
It's a race condition, so not readily reproducible.
Start CF before the app that creates the sem (still not guaranteed, but increases the chance the race will be lost)
Add an artificial delay during startup for the app that creates the sem (just further increases the chance the race will be lost)
Expected behavior
Should be guaranteed via sync mechanisms, or shouldn't be a hard error (e.g. maybe retry to bind later?)
Suggestion: CF could still start up but with the engine in a disabled state, so that someone can correct the condition and enable the engine, rather than having CF abort/exit.
Adding a call to CFE_ES_WaitForStartupSync() before starting the engine might help too...
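The two suggestions above could be combined roughly as follows. This is only a sketch under assumptions: the cFE and OSAL calls are replaced with local stubs so it compiles standalone, and `EngineState`, `StartEngine`, the stub flags, the error-code value, and the timeout are hypothetical names and values, not taken from CF. It also assumes the owning app creates the semaphore before it reaches its own sync point, which the startup sync by itself does not guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins so this compiles alone; real code would use the
 * cFE and OSAL headers. */
typedef uint32_t osal_id_t;
#define OS_SUCCESS            0
#define OS_ERR_NAME_NOT_FOUND (-17) /* placeholder value for this sketch */

static bool stub_sem_exists                    = false;
static bool stub_owner_creates_sem_before_sync = true;

static void CFE_ES_WaitForStartupSync(uint32_t TimeOutMilliseconds)
{
    (void)TimeOutMilliseconds;
    /* Simulates the other apps finishing their init while we wait. */
    if (stub_owner_creates_sem_before_sync)
    {
        stub_sem_exists = true;
    }
}

static int32_t OS_CountSemGetIdByName(osal_id_t *sem_id, const char *sem_name)
{
    (void)sem_name;
    if (!stub_sem_exists)
    {
        return OS_ERR_NAME_NOT_FOUND;
    }
    *sem_id = 7;
    return OS_SUCCESS;
}

/* Hypothetical engine state, far simpler than the real CF engine. */
typedef struct
{
    bool      engine_enabled;
    osal_id_t throttle_sem;
} EngineState;

#define STARTUP_SYNC_TIMEOUT_MS 10000 /* assumed value */

/* Wait for the other apps, then try to bind. On failure the engine is left
 * disabled and the app keeps running, so the condition can be corrected and
 * the engine enabled later instead of the app exiting. */
bool StartEngine(EngineState *engine, const char *sem_name)
{
    CFE_ES_WaitForStartupSync(STARTUP_SYNC_TIMEOUT_MS);

    if (OS_CountSemGetIdByName(&engine->throttle_sem, sem_name) == OS_SUCCESS)
    {
        engine->engine_enabled = true;
    }
    else
    {
        engine->engine_enabled = false;
    }

    return engine->engine_enabled;
}
```

Because the sync call can also return on timeout, the disabled-engine fallback still matters even when the sync is in place.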
Code snips
CF/fsw/src/cf_cfdp.c
Lines 1014 to 1015 in 2a024d8
System observed on:
Ubuntu 21.10
Reporter Info
Joseph Hickey, Vantage Systems, Inc.