Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ostree repo commit: Speed up composing trees with --tree=ref #1643

Closed
wants to merge 3 commits into from

Conversation

wmanley
Copy link
Member

@wmanley wmanley commented Jun 25, 2018

TODO

Description

Running ostree commit --tree=ref=a --tree=dir=b involves reading the
whole of a into an OstreeMutableTree before composing b on top. This
is inefficient if a is a complete rootfs and b is just touching one file.
We process O(size of a + size of b) directories rather than
O(number of touched dirs).

This commit makes ostree commit more efficient at composing multiple
directories together. With ostree_mutable_tree_fill_empty_from_dirtree
we create a lazy OstreeMutableTree which only reads the underlying
information from disk when needed. We don't need to read all the
subdirectories just to get the checksum of a tree already checked into the
repo.

This provides great speedups when composing a rootfs out of multiple other
rootfs as we do in our build system. We compose multiple containers
together with:

ostree commit --tree=ref=base-rootfs --tree=ref=container1 --tree=ref=container2

and it is much faster now.

As a test I ran

time ostree --repo=... commit --orphan --tree=ref=big-rootfs --tree=dir=modified_etc

Where modified_etc contained a modified sudoers file under /etc. I used
strace to count syscalls and I seperatly took timing measurements. To
test with a cold cache I ran

sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

Results:

Before After
Time (cold cache) 8.1s 0.12s
Time (warm cache) 3.7s 0.08s
openat calls 53589 246
fgetxattr calls 78916 0

I'm not sure if this will have some negative interaction with the
_ostree_repo_commit_modifier_apply which is short-circuited here. I
think it was disabled for --tree=ref= anyway, but I'm not certain. All
the tests pass anyway.

I originally implemented this in terms of the OstreeRepoFile APIs, but
it was way less efficient, opening and reading many files unnecessarily.

@papr-bot
Copy link

Can one of the admins verify this patch?

}

/**
* ostree_mutable_tree_fill_empty_from_dirtree:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly a better name would be ostree_mutable_tree_fast_merge? It's more intention revealing. Or perhaps this should be moved into a -private.h header and not be made part of the public API?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would agree with that (keeping it private for now).

@jlebon
Copy link
Member

jlebon commented Jun 25, 2018

bot, add author to whitelist

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Let's start by splitting out the first three commits in a prep PR? I did a quick review for those already.

$OSTREE commit ${COMMIT_ARGS} -b test3-ref-ref -s "A" --tree=ref=test3-C1 --tree=ref=test3-D
$OSTREE commit ${COMMIT_ARGS} -b test3-dir-ref -s "A" --tree=dir=tree-C --tree=ref=test3-D
$OSTREE commit ${COMMIT_ARGS} -b test3-ref-dir -s "A" --tree=ref=test3-C1 --tree=dir=tree-D
$OSTREE commit ${COMMIT_ARGS} -b test3-dir-dir -s "A" --tree=dir=tree-C --tree=dir=tree-D
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor/optional: the -s switch is optional nowadays.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #1645

@@ -450,16 +450,20 @@ echo sudoers2 >tree-C/etc/sudoers
$OSTREE commit ${COMMIT_ARGS} -b test3-C2 -s "C" --tree=dir=tree-C
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this whole hunk be part of the previous commit where we added this test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, fixed in #1645

/* This is the checksum of the Dirtree object that corresponds to the current
* contents of this directory. contents_checksum can be NULL if the SHA was
* never calculated or contents of the mtree has been modified. Even if
* contents_checksum is not NULL it may be out of date. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if contents_checksum is not NULL it may be out of date.

Is that true though? We invalidate it before it gets out of date, no?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, see the comment in ostree_mutable_tree_get_contents_checksum:

/* Ensure the cache is valid; this implementation is a bit
* lame in that we walk the whole tree every time this
* getter is called; a better approach would be to invalidate
* all of the parents whenever a child is modified.
*
* However, we only call this function once right now.
*/
GLNX_HASH_TABLE_FOREACH_V (self->subdirs, OstreeMutableTree*, subdir)
{
if (!ostree_mutable_tree_get_contents_checksum (subdir))
{
g_free (self->contents_checksum);
self->contents_checksum = NULL;
return NULL;
}
}

We do invalidate it if this mtree is modified, but not if subdirectories of this mtree are modified.

In state == LAZY it is always accurate.

@wmanley
Copy link
Member Author

wmanley commented Jun 25, 2018

I've been doing some more fiddling with this and I think there is a bug in here still. I'm trying to create a minimal testcase that reproduces the symptom.

@cgwalters
Copy link
Member

See #1569 - I just did a bit of digging and added a commit link there.

@wmanley
Copy link
Member Author

wmanley commented Jun 25, 2018

I've figured out what the bug is: ostree_mutable_tree_get_contents_checksum expects that one of it's subdirs' contents_checksum will be NULL if it's not valid. This isn't the case for LAZY mtrees, because we're certain of the checksums.

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 5d031ae) made this pull request unmergeable. Please resolve the merge conflicts.

self->state = MTREE_STATE_LAZY;
g_set_object (&self->repo, repo);
set_string (&self->contents_checksum, contents_checksum);
set_string (&self->metadata_checksum, metadata_checksum);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I need to invalidate the parent checksum here. Or have a LAZY tree that (publically) has a NULL checksum.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep - this was the issue. I've fixed it now.

@wmanley wmanley force-pushed the lazy-mtree branch 2 times, most recently from 61380e2 to a8f24d0 Compare June 26, 2018 00:19
@wmanley
Copy link
Member Author

wmanley commented Jun 26, 2018

Ok, tests are passing now, including the new test I wrote for the bug I found in my previous implementation. The diff is a little more complex now that I've had to add the parent pointer, but I think it all makes sense.

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do the same thing again and split the first two commits (1c1be48 and 3c091e4) into a separate PR? Those are 👍 to me (with a few comments).

The lazy stuff... it looks sane at a high-level, though I'd like to play around some more to fully contextualize it.

* a child from within this file (see insert_child_mtree). We ensure that the
* parent pointer is either valid or NULL because when the parent is destroyed
* it sets parent = NULL on all its children (see remove_child_mtree) */
OstreeMutableTree *parent;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like it might be a good fit for weak refs: https://developer.gnome.org/gobject/stable/gobject-The-Base-Object-Type.html#g-object-add-weak-pointer. So then your hash table value free function can just be g_object_unref.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did consider this. I figured it's not worth the overhead both in the source and at runtime. The difference would be:

  • Atomic instructions would be used for the weak refcnt. It isn't threadsafe to use different OstreeMutableTree instances that are part of the same hierarchy from different threads anyway as we have no synchronisation around contents_checksum. We could add more synchronisation here but it hardly seems worth it.
  • The (immediate) parents would not be freed until the children had been destroyed: the weak pointer still needs to point to somewhere to check the weak refcnt (assuming it's similar to C++'s shared_ptr with make_shared)

They're minor points but I think the current implementation is nicer and we avoid a bunch of error handling for codepaths that will never be taken. I'm not wed to the current implementation so I can change it to use weak_refs if you'd like.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a quick comment, I haven't fully reviewed this patch yet.

Yeah, GLib's weak refs use a global mutex. I don't know if that really matters here (probably not) - bigger perf wins would probably be having a more efficient string-tree structure.

Having non-owning pointers for trees is pretty standard I believe.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just looked at Rust's BTree for example, and there the parent pointer is a raw pointer. It kind of has to be in Rust of course, otherwise it'd be cyclic.

break;

g_free (self->contents_checksum);
self->contents_checksum = NULL;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two can be g_clear_pointer (&self->contents_checksum, g_free).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed (on #1655).

@@ -303,9 +343,10 @@ ostree_mutable_tree_ensure_parent_dirs (OstreeMutableTree *self,
next = g_hash_table_lookup (subdir->subdirs, name);
if (!next)
{
invalidate_contents_checksum (subdir);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be cleaner to just put this call in insert_child_mtree?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't do that because we call insert_child_mtree in make_whole where we're not invalidating any checksums.

if (g_strcmp0 (checksum, self->metadata_checksum) == 0)
return;

invalidate_contents_checksum (self->parent);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice fix!

/* We're not empty - can't convert to a LAZY tree */
return FALSE;
default:
g_return_val_if_reached (FALSE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually just do g_assert_not_reached () for these.

}

/**
* ostree_mutable_tree_fill_empty_from_dirtree:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I would agree with that (keeping it private for now).

@wmanley
Copy link
Member Author

wmanley commented Jun 27, 2018

Let's do the same thing again and split the first two commits (1c1be48 and 3c091e4) into a separate PR? Those are +1 to me (with a few comments).

Raised #1655

@rh-atomic-bot
Copy link

☔ The latest upstream changes (presumably 488365f) made this pull request unmergeable. Please resolve the merge conflicts.

Running `ostree commit --tree=ref=a --tree=dir=b` involves reading the
whole of a into an `OstreeMutableTree` before composing `b` on top.  This
is inefficient if a is a complete rootfs and b is just touching one file.
We process O(size of a + size of b) directories rather than
O(number of touched dirs).

This commit makes `ostree commit` more efficient at composing multiple
directories together.  With `ostree_mutable_tree_fill_empty_from_dirtree`
we create a lazy `OstreeMutableTree` which only reads the underlying
information from disk when needed.  We don't need to read all the
subdirectories just to get the checksum of a tree already checked into the
repo.

This provides great speedups when composing a rootfs out of multiple other
rootfs as we do in our build system.  We compose multiple containers
together with:

    ostree commit --tree=ref=base-rootfs --tree=ref=container1 --tree=ref=container2

and it is much faster now.

As a test I ran

    time ostree --repo=... commit --orphan --tree=ref=big-rootfs --tree=dir=modified_etc

Where modified_etc contained a modified sudoers file under /etc.  I used
`strace` to count syscalls and I seperatly took timing measurements.  To
test with a cold cache I ran

    sync && echo 3 | sudo tee /proc/sys/vm/drop_caches

Results:

|                      | Before | After |
| -------------------- | ------ | ----- |
| Time (cold cache)    |   8.1s | 0.12s |
| Time (warm cache)    |   3.7s | 0.08s |
| `openat` calls       |  53589 |   246 |
| `fgetxattr` calls    |  78916 |     0 |

I'm not sure if this will have some negative interaction with the
`_ostree_repo_commit_modifier_apply` which is short-circuited here.  I
think it was disabled for `--tree=ref=` anyway, but I'm not certain.  All
the tests pass anyway.

I originally implemented this in terms of the `OstreeRepoFile` APIs, but
it was *way* less efficient, opening and reading many files unnecessarily.
@wmanley
Copy link
Member Author

wmanley commented Jul 2, 2018

I've rebased (and squashed) this since #1655 was merged.

Copy link
Member

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a quick scan. The high level idea makes sense.

Can you add some unit tests for this patch too? Or is that too hard to do without lowering the higher level usage too? If that's the case I'd probably prefer having the changes as a single PR.

/* ======== Valid for state LAZY: =========== */

/* The repo so we can look up the checksums. */
OstreeRepo *repo;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's going to result in a lot of refs on the repo. 🤔(In Rust we'd just have the mutable tree have a lifetime bounded by that of the repo)

But eh, it's fine for now, if we ever profile and find this is an issue we can perhaps just have the ref be maintained by the toplevel mutable tree, and children chase up to the root.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's going to result in a lot of refs on the repo.

Not too many I think. There's only one per LAZY OstreeMutableTree. This means that it's only leaf directories that have refs, which initially is only 1 directory (the root).

I agree that it's a shame to churn the repo refcount, but it's unlikely to be worth fixing.


/* _ostree_mutable_tree_make_whole can fail if state == MTREE_STATE_LAZY, but
* we have getters that preceed the existence of MTREE_STATE_LAZY which can't
* return errors. So instead this function will fail and print a warning. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. One approach I've taken in other places for things like this is to cache the error, changing the object into an "error state". Have all API functions be "noops", and return the error at the next opportunity.

We might also be able to add a new API to check for and consume that error?

Not a hard requirement right now though.

@wmanley
Copy link
Member Author

wmanley commented Jul 5, 2018

Can you add some unit tests for this patch too? Or is that too hard to do without lowering the higher level usage too? If that's the case I'd probably prefer having the changes as a single PR.

What did you have in mind? I added some unit tests for this functionality in 36f7a6d which was already merged as part of #1645.

I suppose I could write some tests in C which explicitly call ostree_mutable_tree_new_from_checksum and ostree_mutable_tree_fill_empty_from_dirtree?

@cgwalters
Copy link
Member

What did you have in mind? I added some unit tests for this functionality in 36f7a6d which was already merged as part of #1645.

Let's back up a second here; in this scenario you mention in the commit message:

ostree commit --tree=ref=base-rootfs --tree=ref=container1 --tree=ref=container2

Are the trees disjoint? For example base-rootfs looks like a traditional system with an empty /usr/share/containers directory, then container1 looks like /usr/share/containers/container1/usr/*files here*, container2 looks like /usr/share/containers/container2/usr/*files here*, etc.

If I'm understanding the optimization here, it's really predicated on that.

Now...I'm staring at that test case code and only sort of understanding it. It looks like it's more testing combinations of --tree=dir and --tree=ref rather than e.g. disjointness of --tree=ref, right?

@cgwalters
Copy link
Member

(Also I pushed a fixup for the error handling, let me know what you think)

@cgwalters
Copy link
Member

cgwalters commented Jul 5, 2018

To rephrase some of the musings from #1569 - in your use case (as well as the promotion case), it feels like we basically want to make it a fatal error if we're not hitting the fast path right?

I completely get that we need to support mixing --tree=ref and --tree=dir because we initially exposed such as flexible API. Hmm. So in the above issue I proposed a --from-ref-contents which would be like --tree=ref, but you actually need a generalization of that.

How about something like this:

ostree commit --disjoint-union --ref=base-os:/ --ref=container1:/usr/share/containers/container1 --ref=container2:/usr/share/containers/container2

The args to --ref are REF:SUBPATH.

--disjoint-union would appear in the API as OSTREE_REPO_COMMIT_MODIFIER_FLAGS_REF_DISJOINT_UNION or so? And it would be an error if non-ref sources were provided, any other commit modifiers were specified, etc.

One (IMO) nicer thing about this is that the path prefix is specified at commit time, so one can have the container1 tree just be rooted at / if that makes sense? Makes it nicer to ostree ls them for example.

And further it would be a fatal error if the refs overlapped at all.

--from-ref-contents foo would just be sugar for --disjoint-union --ref=foo:/.

And if you didn't want to change the "schema" for your existing containers refs, that'd just be a matter of using --ref=container1:/

@wmanley
Copy link
Member Author

wmanley commented Jul 9, 2018

Are the trees disjoint? For example base-rootfs looks like a traditional system with an empty /usr/share/containers directory, then container1 looks like /usr/share/containers/container1/usr/*files here*, container2 looks like /usr/share/containers/container2/usr/*files here*, etc.

That's right, this is what our use-case looks like.

If I'm understanding the optimization here, it's really predicated on that.

The optimisation is more general than this and not limited to this use-case. For example: lets say you have a rootfs and want to add dash to it. Maybe the dash package looks like:

/usr/bin/dash
/usr/share/doc/dash/README
/usr/share/man/man1/dash.1.gz
/usr/share/man/man1/sh.1.gz

so then we do `ostree commit --tree=ref=rootfs --tree=ref=dash

With the new implementation we only need to read the following dirtrees:

rootfs:/
rootfs:/usr
rootfs:/usr/bin
rootfs:/usr/share
rootfs:/usr/share/doc
rootfs:/usr/share/man
rootfs:/usr/share/man/man1
dash:/
dash:/usr
dash:/usr/bin
dash:/usr/share
dash:/usr/share/doc
dash:/usr/share/man
dash:/usr/share/man/man1

to produce a complete merged rootfs. That's 14 dirtree files. This is optimal taking advantage of the fact that ostree is implemented in terms of trees.

On my laptop I have 48417 directories under /usr. Thats 48417 dirtrees that wouldn't need to be read, including a load under /usr/share/icons, /usr/share/texlive, /usr/share/perl, all of which are irrelevant for adding a dash package to this rootfs. So this optimisation is more general than just for disjoint refs and I think more generally useful.

$ find /usr -type d -xdev | wc -l
48417

There is another orthogonal way in which this implementation is faster too. I discovered using strace that we were reading all the .file and .dirmeta files of each tree we looked at. I assume this was caused by the GLib directory iteration code calling stat on each returned file. This implementation decodes the dirtree GVariants itself directly so is more explicit and can be more careful about not doing more work than necessary.

Now...I'm staring at that test case code and only sort of understanding it. It looks like it's more testing combinations of --tree=dir and --tree=ref rather than e.g. disjointness of --tree=ref, right?

The intention was to test this code by comparing the results of using --tree=ref with an implementation we already have confidence in - the plain --tree=dir approach. This implementation will used and tested by those tests.

To rephrase some of the musings from #1569 - in your use case (as well as the promotion case), it feels like we basically want to make it a fatal error if we're not hitting the fast path right?

I don't think so. I consider this optimisation more general than that.

I completely get that we need to support mixing --tree=ref and --tree=dir because we initially exposed such as flexible API. Hmm. So in the above issue I proposed a --from-ref-contents which would be like --tree=ref, but you actually need a generalization of that.

How about something like this:

ostree commit --disjoint-union --ref=base-os:/ --ref=container1:/usr/share/containers/container1 --ref=container2:/usr/share/containers/container2

The args to --ref are REF:SUBPATH.

This is similar to my implementation in #1651, but with a different API. With #1651 this would look like:

ostree commit --tree=ref=base-os \
    --tree=prefix=/usr/share/containers/container1 --tree=ref=container1 \
    --tree=prefix=/usr/share/containers/container2 --tree=ref=container2

I had considered the REF:SUBPATH API but I thought that would be a more natural API for selecting subtrees of REF rather than specifying where the ref should be "mounted". This is the API implemented in #1658.

The other issue with the REF:SUBPATH API is that you can't also apply it to --tree=dir=FILENAME:SUBPATH because FILENAME may contain a : so you can't disambiguate.

The --tree=prefix= is a bit ugly. I implemented it this way so GOption would preserve the order of the prefixes WRT the --tree options.

--disjoint-union would appear in the API as OSTREE_REPO_COMMIT_MODIFIER_FLAGS_REF_DISJOINT_UNION or so? And it would be an error if non-ref sources were provided, any other commit modifiers were specified, etc.

One (IMO) nicer thing about this is that the path prefix is specified at commit time, so one can have the container1 tree just be rooted at / if that makes sense? Makes it nicer to ostree ls them for example.

When building our trees we have a lot of intermediate trees. So out build process looks like

  1. Make container rootfs mounted at /
  2. Move container rootfs to /usr/share/containers/container1
  3. Merge together different images

With all my branches merged this would look like:

container1=$(ostree commit --prefix=/usr/share/containers/container1/usr --tree=ref=container1:usr)
container2=$(ostree commit --prefix=/usr/share/containers/container2/usr --tree=ref=container2:usr)
final_rootfs=$(ostree commit --tree=ref=$rootfs --tree=ref=$container1 --tree=ref=$container2)

I've considered enhancing our buildsystem to do the merging and moving in one step, but it's not important if ostree is blazing fast at this anyway.

And further it would be a fatal error if the refs overlapped at all.

--from-ref-contents foo would just be sugar for --disjoint-union --ref=foo:/.

And if you didn't want to change the "schema" for your existing containers refs, that'd just be a matter of using --ref=container1:/

I don't see the advantage of the fatal error as compared to just being as fast as we can be in any given situation.

@cgwalters
Copy link
Member

I assume this was caused by the GLib directory iteration code calling stat on each returned file.

(That's a bug/missed-optimization in ostree-repo-file-enumerator.c though it'd be a bit messy to fix).

I don't see the advantage of the fatal error as compared to just being as fast as we can be in any given situation.

That makes sense to me! OK, I'm convinced by that. That said, I think the points I raised about not having the commit modifiers work are still valid; I have a vague worry that someone is relying on e.g. --tree=ref=blah --owner-uid 42 or whatever. But I checked flatpak-builder, BuildStream and they're writing filesystems. So does rpm-ostree. So...if we break someone's build we can revisit.

Thanks again for this really nice patch!

@rh-atomic-bot r+ 2d41891

@rh-atomic-bot
Copy link

⌛ Testing commit 2d41891 with merge c7b12a8...

@rh-atomic-bot
Copy link

☀️ Test successful - status-atomicjenkins
Approved by: cgwalters
Pushing c7b12a8 to master...

@wmanley wmanley deleted the lazy-mtree branch July 9, 2018 14:15
cgwalters added a commit to cgwalters/coreos-assembler that referenced this pull request Apr 24, 2019
I want to add some more testing of OS updates to
https://github.com/openshift/machine-config-operator/
and it's not really necessary to do a full e2e build - just
adding a dummy file to the oscontainer is going to be sufficient
to start.

Related to this, requiring privileges (like `compose tree`) does
would be a huge ergonomic hit.  This script aims to improve
the code that landed for overriding the kubelet in Origin:
https://github.com/smarterclayton/origin/blob/4de957b019aee56931b1a29af148cf64865a969b/images/os/Dockerfile

Once this lands I'll do a PR to origin to change it to use coreos-assembler.
(though that image is enormous, maybe we want explicit layering for
 things that don't need image building, also we really need to cut out
 Anaconda)

Side note: while writing this I ended up rediscovering ostreedev/ostree#1643
It was less than a year ago but that already feels like forever...
cgwalters added a commit to cgwalters/coreos-assembler that referenced this pull request Apr 25, 2019
I want to add some more testing of OS updates to
https://github.com/openshift/machine-config-operator/
and it's not really necessary to do a full e2e build - just
adding a dummy file to the oscontainer is going to be sufficient
to start.

Related to this, requiring privileges (like `compose tree`) does
would be a huge ergonomic hit.  This script aims to improve
the code that landed for overriding the kubelet in Origin:
https://github.com/smarterclayton/origin/blob/4de957b019aee56931b1a29af148cf64865a969b/images/os/Dockerfile

Once this lands I'll do a PR to origin to change it to use coreos-assembler.
(though that image is enormous, maybe we want explicit layering for
 things that don't need image building, also we really need to cut out
 Anaconda)

Side note: while writing this I ended up rediscovering ostreedev/ostree#1643
It was less than a year ago but that already feels like forever...
jlebon pushed a commit to coreos/coreos-assembler that referenced this pull request Apr 25, 2019
I want to add some more testing of OS updates to
https://github.com/openshift/machine-config-operator/
and it's not really necessary to do a full e2e build - just
adding a dummy file to the oscontainer is going to be sufficient
to start.

Related to this, requiring privileges (like `compose tree`) does
would be a huge ergonomic hit.  This script aims to improve
the code that landed for overriding the kubelet in Origin:
https://github.com/smarterclayton/origin/blob/4de957b019aee56931b1a29af148cf64865a969b/images/os/Dockerfile

Once this lands I'll do a PR to origin to change it to use coreos-assembler.
(though that image is enormous, maybe we want explicit layering for
 things that don't need image building, also we really need to cut out
 Anaconda)

Side note: while writing this I ended up rediscovering ostreedev/ostree#1643
It was less than a year ago but that already feels like forever...
ashcrow pushed a commit to ashcrow/coreos-assembler that referenced this pull request May 30, 2019
I want to add some more testing of OS updates to
https://github.com/openshift/machine-config-operator/
and it's not really necessary to do a full e2e build - just
adding a dummy file to the oscontainer is going to be sufficient
to start.

Related to this, requiring privileges (like `compose tree`) does
would be a huge ergonomic hit.  This script aims to improve
the code that landed for overriding the kubelet in Origin:
https://github.com/smarterclayton/origin/blob/4de957b019aee56931b1a29af148cf64865a969b/images/os/Dockerfile

Once this lands I'll do a PR to origin to change it to use coreos-assembler.
(though that image is enormous, maybe we want explicit layering for
 things that don't need image building, also we really need to cut out
 Anaconda)

Side note: while writing this I ended up rediscovering ostreedev/ostree#1643
It was less than a year ago but that already feels like forever...
cgwalters added a commit to cgwalters/ostree that referenced this pull request Apr 6, 2020
I was trying to followup the `--selinux-policy-from-base` work
to add a `cosa build --fast=overlay` for coreos-assembler,
but hit on the fact that using e.g. `--owner-uid` disables
commit optimizations.

A while ago, ostreedev#1643 landed
which optimized this for the case where no modifications are provided.
But, we really need the SELinux policy bits, and it's super convenient
to run `ostree commit` as non-root.

It's fairly surprising actually that it's taken us so long to
iterate on a good interface for this "commit changes on top of a base"
model.  In practice, many nontrivial cases really end up needing
to do a (hardlink) checkout, and that case is optimized.

But for this coreos-assembler work I want to directly overlay onto
a commit object another commit object.

That previous PR above added exactly the API we need, so let's
expose it in the CLI.

What you can see happening in the test is that we provide
`--owner-uid 42`, but that only applies to directories/files
that were added in the commit.

And now that I look at this, I think what we really want here
is to avoid changing directories that exist in the base, but
eh; in practice the main use here is for `--owner-uid 0` while
committing as non-root; and that works fine with this since
the baseline uid will be zero as well.
cgwalters added a commit to cgwalters/ostree that referenced this pull request Apr 6, 2020
I was trying to followup the `--selinux-policy-from-base` work
to add a `cosa build --fast=overlay` for coreos-assembler,
but hit on the fact that using e.g. `--owner-uid` disables
commit optimizations.

A while ago, ostreedev#1643 landed
which optimized this for the case where no modifications are provided.
But, we really need the SELinux policy bits, and it's super convenient
to run `ostree commit` as non-root.

It's fairly surprising actually that it's taken us so long to
iterate on a good interface for this "commit changes on top of a base"
model.  In practice, many nontrivial cases really end up needing
to do a (hardlink) checkout, and that case is optimized.

But for this coreos-assembler work I want to directly overlay onto
a commit object another commit object.

That previous PR above added exactly the API we need, so let's
expose it in the CLI.

What you can see happening in the test is that we provide
`--owner-uid 42`, but that only applies to directories/files
that were added in the commit.

And now that I look at this, I think what we really want here
is to avoid changing directories that exist in the base, but
eh; in practice the main use here is for `--owner-uid 0` while
committing as non-root; and that works fine with this since
the baseline uid will be zero as well.
cgwalters added a commit to cgwalters/ostree that referenced this pull request Apr 6, 2020
I was trying to followup the `--selinux-policy-from-base` work
to add a `cosa build --fast=overlay` for coreos-assembler,
but hit on the fact that using e.g. `--owner-uid` disables
commit optimizations.

A while ago, ostreedev#1643 landed
which optimized this for the case where no modifications are provided.
But, we really need the SELinux policy bits, and it's super convenient
to run `ostree commit` as non-root.

It's fairly surprising actually that it's taken us so long to
iterate on a good interface for this "commit changes on top of a base"
model.  In practice, many nontrivial cases really end up needing
to do a (hardlink) checkout, and that case is optimized.

But for this coreos-assembler work I want to directly overlay onto
a commit object another commit object.

That previous PR above added exactly the API we need, so let's
expose it in the CLI.

What you can see happening in the test is that we provide
`--owner-uid 42`, but that only applies to directories/files
that were added in the commit.

And now that I look at this, I think what we really want here
is to avoid changing directories that exist in the base, but
eh; in practice the main use here is for `--owner-uid 0` while
committing as non-root; and that works fine with this since
the baseline uid will be zero as well.
agners pushed a commit to agners/ostree that referenced this pull request May 29, 2020
I was trying to followup the `--selinux-policy-from-base` work
to add a `cosa build --fast=overlay` for coreos-assembler,
but hit on the fact that using e.g. `--owner-uid` disables
commit optimizations.

A while ago, ostreedev#1643 landed
which optimized this for the case where no modifications are provided.
But, we really need the SELinux policy bits, and it's super convenient
to run `ostree commit` as non-root.

It's fairly surprising actually that it's taken us so long to
iterate on a good interface for this "commit changes on top of a base"
model.  In practice, many nontrivial cases really end up needing
to do a (hardlink) checkout, and that case is optimized.

But for this coreos-assembler work I want to directly overlay onto
a commit object another commit object.

That previous PR above added exactly the API we need, so let's
expose it in the CLI.

What you can see happening in the test is that we provide
`--owner-uid 42`, but that only applies to directories/files
that were added in the commit.

And now that I look at this, I think what we really want here
is to avoid changing directories that exist in the base, but
eh; in practice the main use here is for `--owner-uid 0` while
committing as non-root; and that works fine with this since
the baseline uid will be zero as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants