
does zfs not use guids for drive (cache/spare) #2155

Closed
prometheanfire opened this issue Mar 3, 2014 · 6 comments
Labels
Type: Documentation (indicates a requested change to the documentation)

Comments

@prometheanfire
Contributor

This caused the pool to break on import; the drive ordering got completely scrambled.

I added what are now cryptb1 and cryptc1.

I don't know of a way to remove the cache devices.

Somehow the spare and cache devices appear both as spare/cache AND inside the raidz3...

# zpool status 
  pool: storage-pool
 state: ONLINE
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: resilvered 7.38M in 0h0m with 0 errors on Mon Mar  3 06:03:07 2014
config:

    NAME         STATE     READ WRITE CKSUM
    storage-pool  ONLINE       0     0     0
      raidz3-0   ONLINE       0     0     0
        cryptr2  ONLINE       0     0     0
        crypta2  ONLINE       0     0     0
        cryptj2  ONLINE       0     0     0
        cryptd2  ONLINE       0     0     0
        crypte2  ONLINE       0     0     0
        cryptf2  ONLINE       0     0     0
        cryptg2  ONLINE       0     0     0
        crypth2  ONLINE       0     0     0
        crypti2  ONLINE       0     0     0
        cryptk2  ONLINE       0     0     0
        cryptq2  ONLINE       0     0     0
        cryptl2  ONLINE       0     0     0
        cryptm2  ONLINE       0     0     0
        crypto2  ONLINE       0     0     0
        cryptp2  ONLINE       0     0     0
    logs
      mirror-1   ONLINE       0     0     0
        cryptb1  ONLINE       0     0     0
        cryptc1  ONLINE       0     0     0
    cache
      cryptq2    FAULTED      0     0     0  corrupted data
      cryptr2    FAULTED      0     0     0  corrupted data
    spares
      cryptl2    FAULTED   corrupted data
@prometheanfire
Contributor Author

I was able to fix it by issuing an import with -d /dev/mapper; a scrub is running now.

    NAME         STATE     READ WRITE CKSUM
    storage-pool  ONLINE       0     0     0
      raidz3-0   ONLINE       0     0     0
        cryptr2  ONLINE       0     0    17  (repairing)
        crypta2  ONLINE       0     0     0
        cryptj2  ONLINE       0     0     0
        cryptd2  ONLINE       0     0     0
        crypte2  ONLINE       0     0     0
        cryptf2  ONLINE       0     0     0
        cryptg2  ONLINE       0     0     0
        crypth2  ONLINE       0     0     0
        crypti2  ONLINE       0     0     0
        cryptk2  ONLINE       0     0     0
        cryptq2  ONLINE       0     0    15  (repairing)
        cryptl2  ONLINE       0     0     0
        cryptm2  ONLINE       0     0     0
        crypto2  ONLINE       0     0     0
        cryptp2  ONLINE       0     0     0
    logs
      mirror-1   ONLINE       0     0     0
        cryptb1  ONLINE       0     0     0
        cryptc1  ONLINE       0     0     0
    cache
      cryptb2    ONLINE       0     0     0
      cryptc2    ONLINE       0     0     0
    spares
      cryptn2    AVAIL   
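
For reference, a minimal sketch of the recovery sequence described above (only the -d /dev/mapper import and the scrub come from this comment; the export and status steps are assumptions about the usual workflow):

    # Re-import the pool, scanning /dev/mapper so devices are matched by
    # their vdev labels under the stable dm-crypt names instead of by
    # whatever paths were recorded before the reordering.
    zpool export storage-pool
    zpool import -d /dev/mapper storage-pool

    # Then verify the data, as the reporter did.
    zpool scrub storage-pool
    zpool status storage-pool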

@dweeezil
Contributor

dweeezil commented Mar 3, 2014

@prometheanfire GUIDs are used for both cache and spare devices, and they can be used to remove them. To find their GUIDs, run zdb -dddd storage-pool 1 | egrep "l2cache|spares" and note the object numbers to the right of the "=". Then run zdb -dddd storage-pool <object_num> and you'll see the pool configuration for that particular class of device.
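
A sketch of that procedure against this pool (the object numbers shown are illustrative placeholders, not values from this issue, and the final step assumes the usual zpool(8) convention that devices may be named by GUID):

    # Object 1 is the MOS object directory; its entries map names such as
    # "l2cache" and "spares" to object numbers.
    zdb -dddd storage-pool 1 | egrep "l2cache|spares"
    #     l2cache = 37        <- example object numbers; yours will differ
    #     spares = 38

    # Dumping one of those objects prints the nvlist for that class of
    # device, including each device's guid.
    zdb -dddd storage-pool 37

    # zpool remove accepts a GUID in place of a device name, so a cache or
    # spare device whose path no longer resolves can still be removed.
    zpool remove storage-pool <guid>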

@prometheanfire
Contributor Author

Thanks, it's fixed now, but I'll keep that in mind for next time (hopefully there won't be a next time).

My question, then, is: even if the disks were reordered, why did ZFS not use the correct disks in the correct places? It had access to all of the correct disks and could have matched each one's GUID to its function.

@behlendorf
Contributor

@prometheanfire The issue is that under Linux we need to open the devices by path. There are provisions for storing an alternate path, so we could store the by-id path, but currently we don't. We could also try to reconstruct the pool automatically; right now that only happens when you manually pass the -d option.
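
Until that lands, a workaround sketch (assuming the pool can be exported and that /dev/disk/by-id is populated on the system) is to import with a -d directory whose names are stable across reboots, just as the reporter did above with /dev/mapper:

    # by-id names do not change when the kernel enumerates disks in a
    # different order, so the paths recorded in the pool stay valid.
    zpool export storage-pool
    zpool import -d /dev/disk/by-id storage-pool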

behlendorf added this to the 0.6.4 milestone Mar 3, 2014
@ryao
Contributor

ryao commented Mar 3, 2014

@dweeezil This will become significantly easier when I revise #2012 so that it can be merged.

behlendorf removed this from the 0.6.4 milestone Oct 30, 2014
@behlendorf
Contributor

I'm closing this issue out because the original problem was resolved. However, we should open a new issue to get the alternate paths implemented in the vdev label. This would improve our resilience when devices are reordered.
