Fix issue in Recover Task. #55

Merged
merged 1 commit into from
Jan 8, 2021

Conversation

shishir-a412ed
Contributor

@shishir-a412ed shishir-a412ed commented Jan 7, 2021

This PR fixes Issue # 3 described in this comment

Hashicorp/nomad issue: hashicorp/nomad#9750

@shishir-a412ed shishir-a412ed self-assigned this Jan 7, 2021
@shishir-a412ed shishir-a412ed requested a review from a team January 7, 2021 18:25
@shishir-a412ed shishir-a412ed changed the title Fix issue in Recover Task. WIP: Fix issue in Recover Task. Jan 7, 2021
@shishir-a412ed shishir-a412ed changed the title WIP: Fix issue in Recover Task. Fix issue in Recover Task. Jan 8, 2021
@shishir-a412ed
Contributor Author

@shivdudhani Can you please review it?

  1. This PR fixes the following error when decoding the driver config:
Dec 17 18:47:14 ip-10-102-98-114 nomad[27030]:  client.driver_mgr.containerd-driver: HELLO: RecoverTask: Failed to decode driver config: driver=containerd-driver @module=containerd-driver timestamp=2020-12-17T18:47:14.167Z
Dec 17 18:47:14 ip-10-102-98-114 nomad[27030]: client.alloc_runner.task_runner: error recovering task; cleaning up: alloc_id=2d6b365f-cc89-b230-2af9-a37ccbf0f6c8 task=adaas-task error="rpc error: code = Unknown desc = failed to decode driver config: EOF" task_id=2d6b365f-cc89-b230-2af9-a37ccbf0f6c8/adaas-task/43e03bbd

I have removed the driver-config decoding logic from RecoverTask, since it is not needed there.
To recover a task, we take the container name from the task handle that the Nomad client supplies in the RecoverTask API call. That name is enough to look up the container, get its task, check the task's status, and reattach to the existing task.

I also checked the docker driver's implementation of RecoverTask, and it doesn't decode the driver config either.

  2. For the network issue ("network hook fails after client restart w/ non-Docker driver", hashicorp/nomad#9750), the fix (PR #9757) will go into the hashicorp/nomad codebase, so no change is needed in the containerd-driver.

@shishir-a412ed shishir-a412ed merged commit 6416bf6 into master Jan 8, 2021
@shishir-a412ed shishir-a412ed deleted the fix_issue branch January 8, 2021 20:16
@github-actions github-actions bot locked and limited conversation to collaborators Jan 8, 2021