Qemu driver failed to load image from local filesystem #518

Closed
legal90 opened this issue Nov 30, 2015 · 5 comments

@legal90
Contributor

legal90 commented Nov 30, 2015

Nomad version: 0.2.1
OS: Ubuntu 14.04 x64 (running as Vagrant VM in Parallels Desktop on OS X 10.11 host)

Symptoms

Nomad fails to run a qemu job if "artifact_source" points to a local file. The symlink is created, but it does not seem to be available to the Nomad job running in the chroot.

Job configuration:

job "qemu-test" {
    datacenters = ["dc1"]

    constraint {
        attribute = "$attr.kernel.name"
        value = "linux"
    }
    update {
        stagger = "10s"
        max_parallel = 1
    }

    group "test" {
        task "debian-image" {
            driver = "qemu"

            config = {
                artifact_source = "/vagrant/debian-hurd-20150320.img"
            }

            resources {
                cpu = 500 # 500 Mhz
                memory = 256 # 256MB
            }
        }
    }
}

Nomad run output:

# nomad run qemu-test.nomad
==> Monitoring evaluation "c0ffeff3-7b94-169d-c29c-c9ec01982c6c"
    Evaluation triggered by job "qemu-test"
    Allocation "ccba4d82-0075-f033-1491-a85052471789" created: node "956783e3-fbc0-dff5-5bed-039669da9327", group "test"
    Allocation "ccba4d82-0075-f033-1491-a85052471789" status changed: "pending" -> "running" ({"debian-image":{"State":"pending","Events":[{"Type":"Started","Time":1448908748485641441,"DriverError":"","ExitCode":0,"Signal":0,"Message":"","KillError":""},{"Type":"Terminated","Time":1448908748537751911,"DriverError":"","ExitCode":1,"Signal":0,"Message":"","KillError":""}]}})
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "c0ffeff3-7b94-169d-c29c-c9ec01982c6c" finished with status "complete"
root@nomad:/vagrant# nomad run qemu-test.nomad
==> Monitoring evaluation "e6bec814-9e9e-e667-7e45-9686027223bf"
    Evaluation triggered by job "qemu-test"
    Allocation "1e84a045-65d6-aa51-8c4c-eed06bc3e588" created: node "7fd0295b-581a-67f8-0fbe-5a72b9a36ad4", group "test"
    Allocation "1e84a045-65d6-aa51-8c4c-eed06bc3e588" status changed: "pending" -> "running" ({"debian-image":{"State":"pending","Events":[{"Type":"Started","Time":1448909199002729488,"DriverError":"","ExitCode":0,"Signal":0,"Message":"","KillError":""},{"Type":"Terminated","Time":1448909199058004078,"DriverError":"","ExitCode":1,"Signal":0,"Message":"","KillError":""}]}})
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "e6bec814-9e9e-e667-7e45-9686027223bf" finished with status "complete"

Nomad agent output:

# nomad agent -dev
<...>
    2015/11/30 20:46:38 [DEBUG] Starting QemuVM command: "qemu-system-x86_64 -machine type=pc,accel=tcg -name debian-hurd-20150320.img -m 256M -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img -nodefconfig -nodefaults -nographic"
    2015/11/30 20:46:39 [INFO] Started new QemuVM: debian-hurd-20150320.img
    2015/11/30 20:46:39 [DEBUG] client: updated allocations at index 9 (1 allocs)
    2015/11/30 20:46:39 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
    2015/11/30 20:46:39 [ERR] client: failed to complete task 'debian-image' for alloc '1e84a045-65d6-aa51-8c4c-eed06bc3e588': Wait returned exit code 1, signal 0, and error <nil>
    2015/11/30 20:46:39 [INFO] client: Restarting Task: debian-image
    2015/11/30 20:46:39 [DEBUG] client: Sleeping for 15s before restarting Task debian-image
    2015/11/30 20:46:39 [DEBUG] client: updated allocations at index 10 (1 allocs)
    2015/11/30 20:46:39 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
    2015/11/30 20:46:39 [DEBUG] http: Request /v1/evaluation/e6bec814-9e9e-e667-7e45-9686027223bf (120.724µs)
    2015/11/30 20:46:39 [DEBUG] http: Request /v1/evaluation/e6bec814-9e9e-e667-7e45-9686027223bf/allocations (149.427µs)
    2015/11/30 20:46:54 [DEBUG] Starting QemuVM command: "qemu-system-x86_64 -machine type=pc,accel=tcg -name debian-hurd-20150320.img -m 256M -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img -nodefconfig -nodefaults -nographic"
    2015/11/30 20:46:54 [INFO] Started new QemuVM: debian-hurd-20150320.img
    2015/11/30 20:46:54 [DEBUG] client: updated allocations at index 11 (1 allocs)
    2015/11/30 20:46:54 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
    2015/11/30 20:46:54 [ERR] client: failed to complete task 'debian-image' for alloc '1e84a045-65d6-aa51-8c4c-eed06bc3e588': Wait returned exit code 1, signal 0, and error <nil>
    2015/11/30 20:46:54 [INFO] client: Restarting Task: debian-image
    2015/11/30 20:46:54 [DEBUG] client: Sleeping for 15s before restarting Task debian-image
    2015/11/30 20:46:54 [DEBUG] client: updated allocations at index 12 (1 allocs)
    2015/11/30 20:46:54 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
    2015/11/30 20:47:09 [DEBUG] Starting QemuVM command: "qemu-system-x86_64 -machine type=pc,accel=tcg -name debian-hurd-20150320.img -m 256M -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img -nodefconfig -nodefaults -nographic"
    2015/11/30 20:47:09 [INFO] Started new QemuVM: debian-hurd-20150320.img
    2015/11/30 20:47:09 [DEBUG] client: updated allocations at index 13 (1 allocs)
    2015/11/30 20:47:09 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
    2015/11/30 20:47:09 [ERR] client: failed to complete task 'debian-image' for alloc '1e84a045-65d6-aa51-8c4c-eed06bc3e588': Wait returned exit code 1, signal 0, and error <nil>
    2015/11/30 20:47:09 [INFO] client: Restarting Task: debian-image
    2015/11/30 20:47:09 [DEBUG] client: Sleeping for 29.254221776s before restarting Task debian-image
    2015/11/30 20:47:09 [DEBUG] client: updated allocations at index 14 (1 allocs)
    2015/11/30 20:47:09 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
<...>

Job stderr

# less /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-image.stderr
<...>
qemu-system-x86_64: -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: could not open disk image /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: Could not open '/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img': No such file or directory
qemu-system-x86_64: -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: could not open disk image /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: Could not open '/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img': No such file or directory
qemu-system-x86_64: -drive file=/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: could not open disk image /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img: Could not open '/tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img': No such file or directory

The image file actually exists, but it is a symlink:

# ls -la /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img
lrwxrwxrwx 1 root root 33 Nov 30 20:48 /tmp/NomadClient556001122/1e84a045-65d6-aa51-8c4c-eed06bc3e588/debian-image/local/debian-hurd-20150320.img -> /vagrant/debian-hurd-20150320.img
@cbednarski
Contributor

Thanks for the report. I'm curious if this is a side effect of running with Vagrant because the file is mounted from the host rather than copied, so there is an extra level of indirection. I have had difficulty with linking to files like this in the past.

My guess:

  1. Host has file /path/debian.img
  2. Guest has mounted file /vagrant/debian.img
  3. Guest symlink /tmp/NomadClient.../debian.img pointing to /vagrant/debian.img.
  4. Guest chroot link /tmp/NomadClient.../debian.img pointing to /vagrant/debian.img but /vagrant/ doesn't exist in the chroot.
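
To illustrate this first guess: inside a chroot, an absolute symlink target is looked up relative to the new root, so a link that resolves on the guest can dangle for the task. The sketch below imitates that lookup with temporary directories instead of a real chroot (which would need root privileges). `resolve_in_chroot` and `demo` are hypothetical helpers written for illustration, not Nomad code, and only a single link hop is followed.

```python
import os
import tempfile


def resolve_in_chroot(chroot_root, path_in_chroot):
    """Return the host path that path_in_chroot would resolve to inside the chroot.

    Follows at most one symlink hop. Absolute link targets are re-rooted
    under chroot_root, which is how a chrooted process would see them.
    """
    host_path = os.path.join(chroot_root, path_in_chroot.lstrip("/"))
    if os.path.islink(host_path):
        target = os.readlink(host_path)
        if os.path.isabs(target):
            return os.path.join(chroot_root, target.lstrip("/"))
        return os.path.normpath(os.path.join(os.path.dirname(host_path), target))
    return host_path


def demo():
    """Build a fake guest layout and report (visible on guest, visible in chroot)."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for the real image location on the guest (e.g. /vagrant).
        outside = os.path.join(tmp, "vagrant")
        os.makedirs(outside)
        image = os.path.join(outside, "debian.img")
        with open(image, "wb") as f:
            f.write(b"\0")

        # Stand-in for the task directory that becomes the chroot root.
        chroot_root = os.path.join(tmp, "task")
        os.makedirs(os.path.join(chroot_root, "local"))
        link = os.path.join(chroot_root, "local", "debian.img")
        os.symlink(image, link)  # absolute target, as in the ls output above

        visible_on_guest = os.path.exists(link)
        visible_in_chroot = os.path.exists(
            resolve_in_chroot(chroot_root, "/local/debian.img")
        )
        return visible_on_guest, visible_in_chroot
```

Under these assumptions the link resolves on the guest but not from the task's point of view, which would match the "No such file or directory" error even though `ls` on the guest shows the link.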

The other possibility is that this is weird because of a double-link problem (maybe trying to hardlink a symlink) where:

  1. as above
  2. as above
  3. as above
  4. Guest chroot link /tmp/NomadClient.../debian.img pointing to /tmp/NomadClient.../debian.img.

I think to fix this we should link directly to the file on disk instead of using the alloc dir indirection.
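
One way to read that suggestion, as a minimal sketch: place the image itself in the task directory with a hard link, and fall back to a copy when a hard link is not possible (for example across filesystems, as with a Vagrant shared folder). `place_artifact` is a hypothetical name, written in Python for illustration only; it is not how the Nomad driver is implemented, and it assumes the destination does not exist yet.

```python
import os
import shutil
import tempfile


def place_artifact(source, dest):
    """Put the real file (never a symlink) at dest.

    Returns "hardlink" or "copy" depending on which strategy succeeded.
    """
    # Resolve any symlinks first so we link to the file itself.
    source = os.path.realpath(source)
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    try:
        os.link(source, dest)
        return "hardlink"
    except OSError:
        # Typically EXDEV: source and dest live on different filesystems.
        shutil.copy2(source, dest)
        return "copy"
```

Either way the task directory ends up with a regular file, so nothing has to be resolved outside the chroot. The trade-off is that the copy fallback costs disk space and time for large images.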

@legal90
Contributor Author

legal90 commented Dec 4, 2015

@cbednarski Unfortunately, it doesn't work even if I put the image in /etc. The error is the same:

# Job stderr
qemu-system-x86_64: -drive file=/tmp/NomadClient719297766/ccef0b8f-86a0-31b6-5254-fd7bd683c682/debian-image/local/debian-hurd-20150320.img: could not open disk image /tmp/NomadClient719297766/ccef0b8f-86a0-31b6-5254-fd7bd683c682/debian-image/local/debian-hurd-20150320.img: Could not open '/tmp/NomadClient719297766/ccef0b8f-86a0-31b6-5254-fd7bd683c682/debian-image/local/debian-hurd-20150320.img': No such file or directory

However, this file exists and is accessible from the chroot:

$ ls /etc/debian-hurd-20150320.img
/etc/debian-hurd-20150320.img

$ ls /tmp/NomadClient719297766/ccef0b8f-86a0-31b6-5254-fd7bd683c682/debian-image/etc/debian-hurd-20150320.img
/tmp/NomadClient719297766/ccef0b8f-86a0-31b6-5254-fd7bd683c682/debian-image/etc/debian-hurd-20150320.img

@dadgar dadgar added this to the v0.3.0 milestone Dec 8, 2015
@dadgar
Contributor

dadgar commented Dec 8, 2015

The file is being linked into the chroot, so I wonder if the Qemu driver is
failing to follow the link. This will require some investigation.
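
As a possible starting point for that investigation, a small script could list every symlink in a task directory that would stop resolving once that directory becomes the root. This is a sketch under assumptions: `links_unreachable_in_chroot` is a hypothetical helper, it checks a single link hop only, and it does not model relative targets that climb above the root with `..`.

```python
import os
import tempfile


def links_unreachable_in_chroot(root):
    """List (link path relative to root, link target) for symlinks under root
    whose target would not exist if root were the filesystem root."""
    findings = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                continue
            target = os.readlink(path)
            if os.path.isabs(target):
                # Absolute targets are looked up from the new root.
                rerooted = os.path.join(root, target.lstrip("/"))
            else:
                # Relative targets are looked up from the link's directory.
                rerooted = os.path.join(dirpath, target)
            if not os.path.exists(rerooted):
                findings.append((os.path.relpath(path, root), target))
    return sorted(findings)
```

Run against a task directory such as the `debian-image` one shown earlier, a non-empty result would point at the chroot rather than at qemu's handling of links.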

@pka

pka commented Jan 5, 2016

Same here on bare metal.

job stderr:

qemu-system-x86_64: -drive file=/var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img: could not open disk image /var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img: Could not open '/var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img': No such file or directory
# ls -l /var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img
lrwxrwxrwx 1 root root 16 Jan  5 21:36 /var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img -> /tmp/debian8.img

# qemu-img info /var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img
image: /var/lib/nomad/client1/alloc/291af8d7-6609-c8a0-d400-a01c998ee3d7/jessieqemu/local/debian8.img
file format: raw
virtual size: 2.0G (2147483648 bytes)
disk size: 1.3G

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 27, 2022