Corrupted qcow image—docker & corectld unresponsive, qcow-tool crashes #116

Open
quinncomendant opened this issue May 9, 2017 · 0 comments


My MacBook Pro crashed, and after rebooting I was unable to use docker. Running docker version on macOS or inside CoreOS (corectl ssh containerland) stalled indefinitely. I was also unable to run corectl kill containerland or corectld stop. The only way to stop corectld was kill -9 PID.
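For anyone who wants to confirm the same symptom without leaving a terminal stuck, here is a minimal sketch that runs a command with a timeout and reports whether it hung. The `probe` helper and its return strings are made up for illustration; they are not part of docker or corectl.

```python
import subprocess


def probe(cmd, timeout=10.0):
    """Run cmd and classify the result as 'ok', 'failed', 'hung' or 'missing'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed once this many seconds pass
        )
    except subprocess.TimeoutExpired:
        return "hung"
    except FileNotFoundError:
        return "missing"
    return "ok" if result.returncode == 0 else "failed"
```

Calling `probe(["docker", "version"])` should come back as "hung" in the state described above, instead of blocking the shell indefinitely.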

I've restored my var_lib_docker.img.qcow2 image from a backup and now everything is working again.

I'm trying to repair the corrupt var_lib_docker.img.qcow2 image using qcow-tool but it crashes:

Attempt using check:

[q@x] qcow-tool check ~/var/coreos/var_lib_docker.img.qcow2
qcow-tool: internal error, uncaught exception:
           (Invalid_argument "Cstruct.sub: [0,32768](32768) off=0 len=-12288")

Attempt using repair:

[q@x] qcow-tool repair ~/var/coreos/var_lib_docker.img.qcow2
qcow-tool: [INFO] Zeroing existing refcount table
qcow-tool: [INFO] Incrementing refcount of the refcount table clusters
qcow-tool: [INFO] Incrementing refcount of the header
qcow-tool: [INFO] Incrementing refcount of the 1 L1 table clusters starting at 2
qcow-tool: [INFO] Incrementing refcount of the data clusters
qcow-tool: internal error, uncaught exception:
           "Assert_failure lib/qcow.ml:390:13"
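Since both subcommands abort before reporting anything useful, it may help to look at the image header directly. The following is a rough sketch, assuming the standard qcow2 on-disk header layout (72 bytes of big-endian fields for version 2); it is not how qcow-tool works internally, and the function names `read_qcow2_header` and `header_problems` are invented for this example. It only flags obviously implausible header values and does not inspect the L1/L2 or refcount tables.

```python
import os
import struct

# Fixed part of the qcow2 header (version 2 layout, 72 bytes, big-endian).
HEADER_FORMAT = ">4sIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)


def read_qcow2_header(path):
    """Return the fixed header fields of a qcow2 image as a dict."""
    wanted = struct.calcsize(HEADER_FORMAT)
    with open(path, "rb") as f:
        raw = f.read(wanted)
    if len(raw) < wanted:
        raise ValueError("file too short to contain a qcow2 header")
    return dict(zip(HEADER_FIELDS, struct.unpack(HEADER_FORMAT, raw)))


def header_problems(header, file_size):
    """Return a list of human-readable problems found in the header."""
    if header["magic"] != b"QFI\xfb":
        return ["bad magic: %r" % header["magic"]]
    problems = []
    if header["version"] not in (2, 3):
        problems.append("unexpected version %d" % header["version"])
    # The format requires at least 9; 21 (2 MiB clusters) is QEMU's limit.
    if not 9 <= header["cluster_bits"] <= 21:
        problems.append("implausible cluster_bits %d" % header["cluster_bits"])
        return problems
    cluster_size = 1 << header["cluster_bits"]
    for name in ("l1_table_offset", "refcount_table_offset"):
        offset = header[name]
        if offset % cluster_size:
            problems.append("%s is not cluster-aligned" % name)
        if offset >= file_size:
            problems.append("%s points past the end of the file" % name)
    return problems
```

For example, `header_problems(read_qcow2_header(path), os.path.getsize(path))` returns an empty list when the header looks sane. An empty list does not mean the image is intact, since the corruption may sit in the tables rather than the header.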

So we've got several issues here:

  • CoreOS doesn't verify the volume (I would expect something like fsck to run if the last shutdown was unexpected)
  • corectld can't recover when a corrupt volume is used
  • qcow-tool crashes when running check and repair

These are big issues, so I don't expect them to be easy to fix, but I'm reporting them here in case others find this useful.

Here's a pastebin of the boot log.

Here's the status of corectl:

Server:
 Version:	0.7.18
 Go Version:	go1.7.3
 Built:		Sat Nov 12 12:31:50 GMT 2016
 OS/Arch:	darwin/amd64

 Pid:		9730
 Uptime:	1 minute ago

Activity:
 Active VMs:	1
 Total Memory:	1024
 Total vCores:	1

 UUID:		11111111-1111-1111-1111-111111111111
  Name:		containerland
  Version:	1381.1.0
  Channel:	beta
  vCPUs:	1
  Memory (MB):	1024
  Pid:		9759
  Uptime:	1 minute ago
  Sees World:	true
  cloud-config:	/Users/q/var/coreos/docker-only-with-persistent-storage.txt
  Network:
    eth0:	192.168.64.2
  Volumes:
   /dev/vda	/Users/q/var/coreos/var_lib_docker.img.qcow2,format=qcow2