Nomad binary crashed due to go-getter error #14668

Closed
daniel1302 opened this issue Sep 22, 2022 · 4 comments
Labels: stage/accepted (Confirmed, and intend to work on. No timeline commitment though.), theme/artifact, theme/crash, type/bug

Comments


daniel1302 commented Sep 22, 2022

Nomad version

Nomad v1.3.1 (2b054e38e91af964d1235faa98c286ca3f527e56)

Operating system and Environment details

Linux (Ubuntu)

Issue

When I deploy an incorrect job, the Nomad client crashes and goes down. I then have to remove the job and delete the Nomad data for that allocation on the broken client before the client can start again.

Reproduction steps

I suspect it is caused by an incorrect artifact source (note the double slash in ...789//data-node):

{
  "GetterSource": "s3::https://s3-eu-west-2.amazonaws.com/vegacapsule-123456789//data-node",
  "GetterOptions": {
    "aws_access_key_secret": "XXXXXXXXXXXXXX",
    "region": "eu-west-2",
    "aws_access_key_id": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
  },
  "GetterHeaders": null,
  "GetterMode": "file",
  "RelativeDest": "/tmp/local/vega/bin/data-node"
}

Expected Result

The job fails.

Actual Result

The node fails: the Nomad client process panics and goes down.
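
For context on why a bad artifact can take down the whole client rather than only the job: in Go, a panic that is not recovered in the goroutine where it occurs terminates the entire process. The sketch below illustrates the general recover pattern that turns such a panic into an ordinary error. It is not Nomad or go-getter code; `fetchArtifact` and `safeFetch` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// fetchArtifact is a hypothetical stand-in for a download step that panics
// on malformed input. It is not Nomad or go-getter code.
func fetchArtifact(parts []string) string {
	return parts[2] // panics when len(parts) < 3
}

// safeFetch runs fetchArtifact in its own goroutine and converts a panic
// into an error, so that only the caller's task fails, not the process.
func safeFetch(parts []string) (string, error) {
	type result struct {
		key string
		err error
	}
	done := make(chan result, 1)
	go func() {
		// recover only works in a deferred function in the panicking goroutine.
		defer func() {
			if r := recover(); r != nil {
				done <- result{err: fmt.Errorf("artifact fetch panicked: %v", r)}
			}
		}()
		done <- result{key: fetchArtifact(parts)}
	}()
	res := <-done
	return res.key, res.err
}

func main() {
	// Two elements only, so fetchArtifact panics inside the goroutine.
	_, err := safeFetch([]string{"", "bucket"})
	fmt.Println("error:", err)
}
```

Without the deferred recover, the same call would end the program with an unrecovered runtime panic, which is consistent with the node going down instead of the job failing.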

Job file (if appropriate)

Nomad Server logs (if appropriate)

Nomad Client logs (if appropriate)

2022-09-22T18:20:48.591Z [INFO]  client.driver_mgr.exec: starting task: driver=exec driver_cfg="{Command:bash Args:[-c /pre-start.sh] ModePID: Mo>
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: panic: runtime error: index out of range [2] with length 2
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: goroutine 319 [running]:
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*S3Getter).parseUrl(0xc000d90a79, 0xc0007ab9e0)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/get_s3.go:269 +0xa25
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*S3Getter).GetFile(0xc0002d9230, {0xc0008d2080, 0x1a}, 0x8)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/get_s3.go:170 +0xe5
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*Client).Get(0xc0002fc2c0)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/client.go:270 +0xd50
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/nomad/client/allocrunner/taskrunner/getter.(*Getter).GetArtifact(0xc00095fb30, {0x2e863d0, 0xc0009202a0}, 0xc0006b4740)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/getter/getter.go:69 +0x1f1
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*artifactHook).doWork(0xc0008c6b10, 0xc000d8bd00, 0xc0008a7590, 0x0, 0xc0006c2fd0, 0x1031b4>
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/artifact_hook.go:45 +0x3d8
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: created by github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*artifactHook).Prestart.func1
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/artifact_hook.go:101 +0xad
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: panic: runtime error: index out of range [2] with length 2
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: goroutine 318 [running]:
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*S3Getter).parseUrl(0xc000b42189, 0xc000b93050)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/get_s3.go:269 +0xa25
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*S3Getter).GetFile(0xc000a02170, {0xc000824520, 0x1a}, 0x8)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/get_s3.go:170 +0xe5
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/go-getter.(*Client).Get(0xc000466000)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/go-getter@v1.6.1/client.go:270 +0xd50
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/nomad/client/allocrunner/taskrunner/getter.(*Getter).GetArtifact(0xc00068a4e0, {0x2e863d0, 0xc0009202a0}, 0xc0006b4700)
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/getter/getter.go:69 +0x1f1
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*artifactHook).doWork(0xc0008c6b10, 0xc000d8bd00, 0xc0008a7590, 0xc0007cefd0, 0x7e00aa, 0x0>
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/artifact_hook.go:45 +0x3d8
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]: created by github.com/hashicorp/nomad/client/allocrunner/taskrunner.(*artifactHook).Prestart.func1
Sep 22 18:20:48 n01.xxxxxxxxxxx.xyz nomad[2215748]:         github.com/hashicorp/nomad/client/allocrunner/taskrunner/artifact_hook.go:101 +0xad
Contributor

lgfa29 commented Sep 23, 2022

Thanks for the report @daniel1302!

We are investigating this problem and we will get back to you once we have more information.

lgfa29 added the theme/crash, theme/artifact, and stage/accepted labels on Sep 23, 2022
lgfa29 self-assigned this on Sep 23, 2022
tgross added this to Needs Triage in Nomad - Community Issues Triage via automation on Sep 30, 2022
tgross moved this from Needs Triage to Triaging in Nomad - Community Issues Triage on Sep 30, 2022
tgross moved this from Triaging to In Progress in Nomad - Community Issues Triage on Sep 30, 2022
Member

tgross commented Nov 18, 2022

@lgfa29 should this have been closed by #14696, which we shipped in 1.4.0 and backported to 1.3.6?

Contributor

lgfa29 commented Nov 18, 2022

Oh, that's right, thanks!

lgfa29 closed this as completed on Nov 18, 2022
Nomad - Community Issues Triage automation moved this from In Progress to Done Nov 18, 2022
@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Mar 19, 2023

3 participants