-
Notifications
You must be signed in to change notification settings - Fork 27
lib/tar: Pre-filter tar archive in Rust #112
Conversation
e1c16da
to
2961a88
Compare
lib/src/tar/write.rs
Outdated
|
||
// Completely discard stuff outside of /usr | ||
if !(matches!(path.as_str(), "/" | "/usr") || path.starts_with("usr/")) { | ||
continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd throw a warn!()
in here. I'm pretty sure we'll discover a good bunch of nasty things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, though since this is a library we should probably allow the caller to collect this data and log/print it however it wants.
Will look at that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the bigger picture I am currently thinking of this as a shortcut - longer term we'll strongly encourage the model in #12 (comment) where we have a required RUN ostree container-finalize-bootable
incantation that will clean this stuff up at build time.
Rather than trying to extend the ostree C code to support arbitrary transformations, let's do pre-parsing here in safe Rust. We keep the `/etc` → `/usr/etc` bits, but we also just completely drop everything not in `/usr` now. As noted in the comment, this pre-validation will hopefully also catch any corrupt tarballs that might be exploitable in the C libarchive codebase.
2961a88
to
520fc40
Compare
OK, that was a fun deep dive into path libraries in Rust, specifically the components method. We now parse, normalize and filter the input path in a much more robust way, and this updated code now also returns a btreemap from "prefix -> number of filtered paths" which should be sufficient to start. |
@@ -26,6 +26,35 @@ impl<T: AsyncRead> ReadBridge<T> { | |||
} | |||
} | |||
|
|||
/// A [`std::io::Write`] implementation backed by an asynchronous source. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW I submitted tokio-rs/tokio#4146
Hmm, what we should really do here actually is store the metadata about filtered paths in the commit object. However, doing that would require splitting up the write/commit phases, which in turn really wants us to be doing all of this natively in Rust instead of forking off |
Any other comments on this, or good to go? |
All good from my side 👍. As (very subjective) feedback on your last comment, I don't personally feel the need to record that information persistently. It can be interesting to log and inspect it at runtime, but in the steady state I guess it's unlikely that we'd want to parse/analyze that from the commit object itself. |
Rather than trying to extend the ostree C code to support
arbitrary transformations, let's do pre-parsing here in safe Rust.
We keep the
/etc
→/usr/etc
bits, but we also just completelydrop everything not in
/usr
now.As noted in the comment, this pre-validation will hopefully also
catch any corrupt tarballs that might be exploitable in the C libarchive
codebase.