Skip to content
This repository has been archived by the owner on Feb 8, 2021. It is now read-only.

add new StorageDriver: rawblock #522

Merged
merged 1 commit into from
Mar 2, 2017
Merged

add new StorageDriver: rawblock #522

merged 1 commit into from
Mar 2, 2017

Conversation

laijs
Copy link
Contributor

@laijs laijs commented Mar 1, 2017

The driver creates raw-file-block for every images/valume.
It likes vfs driver in the operation creating an image from
the parent. Both drivers do the operation by copying, but
rawblock driver tries to copy the parent by reflink.
So the copy-on-write is supported for the rawblock driver
if the backing(host) filesystem supports copy-on-write.

If the backing filesystem is btrfs, rawblock driver
supports copy-on-write now. If the backing filesystem
is xfs, the rawblock driver will test whether it is
reflink-enabled, if true, rawblock driver supports
copy-on-write.

rawblock driver vs vfs driver:
  hypervisor handles block better than dir(9p, sharefs).
  rawblock driver supports copy-on-write

rawblock driver vs devicemapper:
  loopback-based devicemapper is too ugly.
  rawblock driver (will) use loopback only when manipulate
    the filesystem in host.

The rawblock driver is also the base feature prepared for
dax support and pages-shared across VMs.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>

@laijs
Copy link
Contributor Author

laijs commented Mar 1, 2017

# hyperctl info
Images: 4
Containers: 0
PODs: 1
Storage Driver: rawblock
  Backing Filesystem: xfs
  Support Copy-On-Write: false
  Block Filesystem: xfs
  Block Size: 10GB
...

}

func (s *RawBlockStorage) RemoveVolume(podId string, record []byte) error {
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the volume file here? Or is it created in the Pod dir?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code was copied from other drivers. but it seems the volume file should be removed.

}

func (s *RawBlockStorage) InjectFile(src io.Reader, mountId, target, baseDir string, perm, uid, gid int) error {
return fmt.Errorf("unsupported")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hyperd inject resolv.conf file with InjectFile by default. And we have inject file cases in CI scripts. I think we need this to be implemented finally.

// check if they are running over btrfs or xfs
switch fsMagic {
case graphdriver.FsMagicBtrfs: // support
case graphdriver.FsMagicXfs: // TODO: check support(201609) and use ioctl(FICLONE) instead of cp --reflink
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit supports btrfs only, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it supports all host filesystem. only btrfs supports cow. xfs supports cow soon.


options = joinMountOptions(options, label.FormatMountLabel("", mountLabel))

//if err := mount.Mount(d.block(id), d.mnt(id), d.blockFs, options); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dose this work? or just use exec for debug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to add loopback related code when use mount.Mount(). using mount commandline is much simpler. The 'Put()' function forgot to detach the loopdev. it needs to be fixed.

return d.createBaseBlock(id, mountLabel)
}
// TODO: use syscall directly, cp --reflink supports btrfs only currently
if out, err := exec.Command("cp", "-a", "--reflink=auto", d.block(parent), d.block(id)).CombinedOutput(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this failed if it is not on btrfs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"--reflink=auto", it always tries with ioctl(BTRFS_IOC_CLONE) and will fallback to normal copy(read-write) on non-btrfs. so we need to use directly syscall for xfs.

ioctl(4, BTRFS_IOC_CLONE, 0x3)          = -1 ENOTTY (Inappropriate ioctl for device)
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(3, "..."..., 65536) = 252
write(4, "..."..., 252) = 252

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thank you

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTRFS_IOC_CLONE has been pulled to generic vfs layer as FICLONE, so ioctl(BTRFS_IOC_CLONE) should work for all file systems that support file clone.

}
spec.Source = block
spec.Format = "xfs"
spec.Fstype = "raw"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Format="raw" && Fstype ="xfs"? Also set spec.Fstype according to rawblockDriver.blockFs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rawblockDriver.blockFs can't be fetched here, we can use 'probefs()' like dm.


func (s *RawBlockStorage) CreateVolume(podId string, spec *apitypes.UserVolume) error {
block := filepath.Join(s.RootPath(), "volumes", fmt.Sprintf("%s-%s", podId, spec.Name))
if out, err := exec.Command("truncate", fmt.Sprintf("--size=%d", storage.DEFAULT_DM_VOL_SIZE), block).CombinedOutput(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call rawblockDriver.CreateVolume() here? otherwise need mkfs?

func (*RawBlockStorage) CleanUp() error { return nil }

func (s *RawBlockStorage) PrepareContainer(containerId, sharedDir string) (*runv.VolumeDescription, error) {
devFullName := filepath.Join(s.RootPath(), "blocks", containerId)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a separator before containerId, like "blocks_"?

return d.createBaseBlock(id, mountLabel)
}
// TODO: use syscall directly, cp --reflink supports btrfs only currently
if out, err := exec.Command("cp", "-a", "--reflink=auto", d.block(parent), d.block(id)).CombinedOutput(); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTRFS_IOC_CLONE has been pulled to generic vfs layer as FICLONE, so ioctl(BTRFS_IOC_CLONE) should work for all file systems that support file clone.

func (d *Driver) Get(id, mountLabel string) (string, error) {
d.Lock()
defer d.Unlock()
active := d.active[id]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

d.active does not seem reliable upon daemon restart. e.g., after daemon restarts, d.active[id] is zero and damon tries to mount the device again, which will fail if a device is mounted before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly we can use xattr to save the refcount in the raw block file, and retrieve it in Get()/Put() when d.active[id] is zero.

@laijs laijs force-pushed the rawblock branch 2 times, most recently from 54b0074 to 5e39011 Compare March 2, 2017 08:00
The driver creates raw-file-block for every images/valume.
It likes vfs driver in the operation creating an image from
the parent. Both drivers do the operation by copying, but
rawblock driver tries to copy the parent by reflink.
So the copy-on-write is supported for the rawblock driver
if the backing(host) filesystem supports copy-on-write.

If the backing filesystem is btrfs, rawblock driver
supports copy-on-write now. If the backing filesystem
is xfs, the rawblock driver will test whether it is
reflink-enabled, if true, rawblock driver supports
copy-on-write.

rawblock driver vs vfs driver:
  hypervisor handles block better than dir(9p, sharefs).
  rawblock driver supports copy-on-write

rawblock driver vs devicemapper:
  loopback-based devicemapper is too ugly.
  rawblock driver (will) use loopback only when manipulate
    the filesystem in host.

The rawblock driver is also the base feature prepared for
dax support and pages-shared across VMs.

Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com>
@bergwolf
Copy link
Member

bergwolf commented Mar 2, 2017

lgtm!

@gnawux
Copy link
Member

gnawux commented Mar 2, 2017

travis build failed, looks like download lvm source failed.

@gnawux
Copy link
Member

gnawux commented Mar 2, 2017

@laijs we need change the lvm download link to: ftp://sources.redhat.com/pub/lvm2/LVM2.2.02.131.tgz

or newer version:
ftp://sources.redhat.com/pub/lvm2/LVM2.2.02.168.tgz

The fedora site has retired.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants