Old job with Computing status in Jobs List. Invalid Ready state of activities. #108

Closed
pwalski opened this issue Apr 10, 2024 · 3 comments · Fixed by golemfactory/yagna#3211
Assignees
Labels
bug Something isn't working

Comments

pwalski commented Apr 10, 2024

The problem:

  • an old job can stay in the Computing status
  • that Computing job does not follow the list ordering (by date)

Reproduction scenario:

  • start MockUI and run a job (in my case the Automatic example)
  • kill the ya-provider process (note that ya-runtime-ai does not stop when the provider is killed)
  • stop yagna, then click Start again
  • click List Jobs. There should be a job in the Computing state

Cause:
After ya-provider is killed, the activity's state stays Ready, and it is never changed afterwards.
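
To illustrate the cause described above, here is a minimal sketch of a startup pass that closes activities left in Ready with no live owner. It is not the code from the linked fix: the `Activity` struct, the `terminate_orphaned` function, the `live` set of ids, and the use of `Terminated` as the target activity state are all assumptions made for this example.

```rust
use std::collections::HashSet;

/// Hypothetical activity states for this sketch. `Ready` is the state named
/// in the issue; `Terminated` as the target state is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ActivityState {
    Ready,
    Terminated,
}

/// Hypothetical persisted activity record.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Activity {
    id: String,
    state: ActivityState,
}

/// Marks every `Ready` activity whose id is not in `live` as `Terminated`
/// and returns the ids that were changed, in input order.
fn terminate_orphaned(activities: &mut [Activity], live: &HashSet<String>) -> Vec<String> {
    let mut changed = Vec::new();
    for activity in activities.iter_mut() {
        if activity.state == ActivityState::Ready && !live.contains(&activity.id) {
            activity.state = ActivityState::Terminated;
            changed.push(activity.id.clone());
        }
    }
    changed
}
```

Running such a pass right after a restart, when `live` is still empty, would terminate every activity that was left in Ready. Whether that is desirable depends on whether activities are meant to survive a restart.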

The attachment contains the Golem module's data dir.
When imported and started, it should show the activity
6ca83ac1079122b87f460587e2c344f1909c090fbc13c74a2a7474fc12140cb1 with the Computing status in the Jobs List:
golem-data.tar.gz

Solution proposal:

This solution allows activities to be destroyed, but it does not close off the possibility of bringing activities back to life after a restart.

@pwalski pwalski added the bug Something isn't working label Apr 10, 2024
@pwalski pwalski self-assigned this Apr 10, 2024
pwalski commented Apr 11, 2024

ya-provider does not support any process groups / jobs on Windows
https://github.com/golemfactory/yagna/blob/f189daee32819d0b44d3a6809e2ef4f617106042/utils/process/src/lib.rs#L66

Maybe we should move the Windows process Job related code from ya-runtime-ai into a library and use it in ya-provider?
https://github.com/golemfactory/ya-runtime-ai/blob/d5683703359165543c116e7c55e679a71a7e2c74/src/main.rs#L290

It could be moved into a dedicated crate in, let's say, exe-unit/components/proc, together with this module: https://github.com/golemfactory/yagna/tree/staszek/gamerhash-combined/exe-unit/components/counters/src/os/process, because it always looked odd there (ya-counters would then use ya-proc when the os feature is enabled).

@pwalski pwalski changed the title Old job with Computing status in Jobs List Old job with Computing status in Jobs List. Invalid Ready state of activities. Apr 11, 2024
@pwalski pwalski removed their assignment Apr 11, 2024
@pwalski pwalski self-assigned this Apr 19, 2024
pwalski commented Apr 25, 2024

@nieznanysprawiciel The market.db from the data dir I use to reproduce the bug contains an agreement in the Approved state with an expired valid_to field.
Shouldn't its state be changed too?
The Terminated state seems most suitable, because Cancelled carries the comment /// Cancelled by a Requestor. However, other agreements also end up in the Terminated state, so it would not indicate that this one was interrupted.
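
The question above can be made concrete with a small sketch of the transition being asked about. This is only an illustration: the `AgreementState` enum lists just the states named in this comment, `state_after_expiry` is a hypothetical helper, and representing `valid_to` as seconds since the epoch is an assumption, not the market.db schema.

```rust
/// Agreement states named in the comment above; the real enum may have more.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgreementState {
    Approved,
    Cancelled,
    Terminated,
}

/// Returns the state an agreement would have at `now` under the proposed rule:
/// an `Approved` agreement past its `valid_to` becomes `Terminated`.
/// Both timestamps are assumed to be seconds since the Unix epoch.
fn state_after_expiry(state: AgreementState, valid_to: u64, now: u64) -> AgreementState {
    match state {
        AgreementState::Approved if now > valid_to => AgreementState::Terminated,
        other => other,
    }
}
```

As noted in the comment, the resulting `Terminated` state alone would not distinguish an interrupted agreement from one that ended normally; that would need an extra field or reason, which this sketch leaves out.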

pwalski commented May 7, 2024

NVM
