Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement compression in tracing-appender #1779

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

pjankiewicz
Copy link

@pjankiewicz pjankiewicz commented Dec 15, 2021

Motivation

This PR implements compression mechanism for tracing-appender crate. In our project we use tracing for event logging and they tend to be very big and compress quite well. We wanted to have a pure Rust solution to handle compression.

Solution

To overcome compatibility issues it looks at the file extension that is given by the user.

let appender = tracing_appender::rolling::minutely("/some/path", "rolling.log.gz")

will output compressed logs to /some/path/rolling.log.yyyy-MM-dd-HH-mm.gz

Note that I add non-optional dependency flate2 to Cargo.toml.

Let me know if this is ok so I can update the documentation.

Update

After @hawkw suggestion I have a new plan:

  • implement builder for RollingFileAppender
  • implement compression configuration behind a feature flag
  • update docs
  • add tests
  • code formatting
  • clippy

@pjankiewicz pjankiewicz requested a review from a team as a code owner December 15, 2021 15:02
@davidbarsky
Copy link
Member

Thanks for working on this! I'm not opposed to adding compression support to tracing-appender, but here are my initial thoughts on this PR:

  • The compression feature should be behind a cargo feature gate named after the compression mechanism used. My understanding is that flate2 supports DEFLATE, zlib, and gzip, so each of these compression mechanism have their feature flag, especially if they require binding out to a C library. We might also want to add support for formats such as zstd in the future.
  • tracing-appender shouldn't be looking at the suffix of the file to determine if/which compression mechanism is used. I think creating a builder for tracing-appender that allows the user to specify the folder, file name, and compression mechanism is probably best in this instance. The compression mechanism should probably be several consts implemented in a similar to how Level is in tracing-core—I think this is slightly easier to feature flag than an enum.
  • All compression features should be off-by-default.

I'll add a few additional comments on the code, but those are my main thoughts at this time.

@hawkw
Copy link
Member

hawkw commented Dec 16, 2021

  • tracing-appender shouldn't be looking at the suffix of the file to determine if/which compression mechanism is used. I think creating a builder for tracing-appender that allows the user to specify the folder, file name, and compression mechanism is probably best in this instance.

+1. I'd expect the filename extensions to be added automatically based on which compression algorithm was selected in the builder, rather than selecting a compression algorithm automatically based on the user adding a filename extension.

@pjankiewicz
Copy link
Author

@davidbarsky @hawkw Thank you for the feedback. I will implement at least 2 compression crates behind feature flags. About the file extensions it is quite common to use it as a selector for the compression method but I understand the need to more explicit 😄 .

The question is about the interface. There are 2 options I see:

  1. Change existing RollingFileAppender
impl RollingFileAppender 
    pub fn new(
        rotation: Rotation,
        directory: impl AsRef<Path>,
        file_name_prefix: impl AsRef<Path>,
        compression: Option<CompressionParams> // new parameter
    ) -> RollingFileAppender  { ... }
}

struct CompressionParams {
     algorithm: SomeEnum,
     levl: SomeOtherEnum
}

and then change the constructors:

  • minutely
  • hourly
  • daily
  • never

by adding Option<CompressionParams>.

  1. Implement new CompressedRollingFileAppender and minutely_compressed, hourly_compressed, daily_compressed, never_compressed

I like the second option because it is backward compatible but eventually th first one is better (maybe reserved for a major release?).

@hawkw
Copy link
Member

hawkw commented Dec 20, 2021

@pjankiewicz Hmm, regarding the API design, I think the ideal approach would be adding a new builder type for construcing a rolling appender. I would expect the API to look sort of like this:

let compressed_appender = RollingFileAppender::builder()
     .compressed(Compression::gzip().level(3)) // or whatever
     .directory("/var/log")
     .file_name_prefix("myapp-")
     .build(Rotation::Hourly);

let non_compressed_appender = let compressed_appender = RollingFileAppender::builder()
     .directory("/var/log")
     .file_name_prefix("myapp-")
     .build(Rotation::Hourly);

I think it's fine if the new constructor and existing minutely, hourly, and daily functions don't have the option to construct a compressed appender; instead, users who want to do more advanced configuration, such as adding compression, can use the new builder. This is approach is also backwards-compatible, but adds less new API surface.

Also, this has the advantage that, because the compression feature is feature-flagged, we can also feature flag the .compressed() builder method, and add a #[doc(cfg(...))] attribute so that the required feature flags are clearly shown in the documentation. If we added a compression parameter to the new constructor, we would end up with a constructor argument that is sometimes just silently ignored based on a feature flag, which seems not great, to me.

@pjankiewicz pjankiewicz changed the title Implement compression in tracing-appender WIP: Implement compression in tracing-appender Dec 23, 2021
@pjankiewicz
Copy link
Author

pjankiewicz commented Dec 23, 2021

@hawkw I managed to rewrite the PR according to your suggestions also I merged your latest changes from master. There were some hurdles with compression feature flag but I think it looks decent. I still need to update the docs, add tests and format the code.

The code will look like this:

RollingFileAppenderBuilder::new()
    .rotation(Roration::Hourly)
    .log_directory("/var/log")
    .log_filename_prefix("myapp-")
    .compression(compression_options::GZIP_BEST)
    .build()

I expect the rest will take me a few days more.

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, this looks like the right direction! i left some suggestions, hopefully they're helpful!

tracing-appender/Cargo.toml Outdated Show resolved Hide resolved
tracing-appender/src/builder.rs Outdated Show resolved Hide resolved
tracing-appender/src/builder.rs Outdated Show resolved Hide resolved
tracing-appender/src/builder.rs Outdated Show resolved Hide resolved
tracing-appender/src/builder.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling.rs Outdated Show resolved Hide resolved
tracing-appender/src/writer.rs Outdated Show resolved Hide resolved
tracing-appender/src/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/builder.rs Outdated Show resolved Hide resolved
@pjankiewicz
Copy link
Author

@hawkw thank you for the detailed code review, I really appreciate it. The issues you mentioned were fixed. It turned out that IntelliJ didn't even compile the code with the feature flag enabled even though it was selected. Not it compiles with and without the compression feature.

Copy link
Member

@hawkw hawkw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, this looks very good. i had a bunch of small suggestions, but I think this should be ready to merge soon!

tracing-appender/src/rolling.rs Outdated Show resolved Hide resolved
tracing-appender/src/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/lib.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling.rs Outdated Show resolved Hide resolved
tracing-appender/src/writer.rs Outdated Show resolved Hide resolved
tracing-appender/src/writer.rs Outdated Show resolved Hide resolved
tracing-appender/src/writer.rs Outdated Show resolved Hide resolved
#[derive(Debug)]
pub(crate) struct CompressedGzip {
compression: CompressionConfig,
buffer: Arc<RwLock<BufWriter<GzEncoder<BufWriter<File>>>>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few questions about this:

  • why does this need to be inside an Arc? as far as I can tell, we never clone it, so i don't think it needs to be reference counted?
  • i don't love that the RwLock here adds a second lock that has to be acquired in the write path when compression is in use, but i see why it's necessary. we probably can't get rid of that without a fairly big rewrite of how the rest of RollingFileAppender works, so this is fine for now :/
  • on that subject, I don't think this needs to be a RwLock. this is only ever locked mutably to write to it, and there's no case where multiple threads might concurrently access it immutably, so i think this should just be a Mutex, which should be a little more lightweight in cases where shared immutable access isn't needed.
  • what's the motivation for wrapping the GzEncoder and the File itself in separate BufWriters? why do we need two layers of buffering? if this is important, a comment explaining it would be nice.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ad 1. Yeah, I might treat Arc<RwLock<...>> as an idiom by now.
ad 2. I didn't see another option. I tried for many hours to have it without any sort of lock but I always stumbled on the same problem that the GzEncoder couldn't be changed only behind a mut ref.
ad 3. I will change it to Mutex.
ad 4. I will check this again. You are right that it is confusing a bit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ad 2. I didn't see another option. I tried for many hours to have it without any sort of lock but I always stumbled on the same problem that the GzEncoder couldn't be changed only behind a mut ref.

yeah, i think this is fine, since it would probably take a lot of work to work around that part. don't worry about it.

@sw-dev-code
Copy link

@pjankiewicz @hawkw What is the current status of this merge? When we can expect it to be part of the official tracing crate?

I really like and need this feature. Thank you for your effort.

@pjankiewicz
Copy link
Author

@sw-dev-code Sorry for the delay I had some other priorities. I will get back to it this week.

@sw-dev-code
Copy link

sw-dev-code commented Feb 21, 2022

@sw-dev-code Sorry for the delay I had some other priorities. I will get back to it this week.

@pjankiewicz Thank you. Your solution is the most elegant one and the only one I found available to work with the Tracing crate. My devices would have limited memory so this feature is a must.

Can I somehow use it before the final release to the Tracing crate?

@hawkw
Copy link
Member

hawkw commented Feb 21, 2022

@pjankiewicz

@sw-dev-code Sorry for the delay I had some other priorities. I will get back to it this week.

Great, I'm happy to help get this over the finish line --- I think it's pretty close to done.

@sw-dev-code

Can I somehow use it before the final release to the Tracing crate?

You'll need to backport this PR onto the v0.1.x branch, and take a git dependency on that branch. This is a bit harder than just using a Git dependency on this branch, because these changes need to be merged into v0.1.x to work with the rest of the released tracing ecosystem. However, because there aren't currently any major API changes in tracing-appender between the master and v0.1.x versions, it should be pretty easy to do the backport.

pjankiewicz and others added 4 commits February 23, 2022 14:11
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
Co-authored-by: Eliza Weisman <eliza@buoyant.io>
@pjankiewicz
Copy link
Author

@hawkw I managed to find the issue with decoding. Before reading from the file the appender needs to be dropped to properly close the buffer.

let expected_value = "Hello";
write_to_log(&mut appender, expected_value);
drop(appender); // had to add this to read from the file
assert!(find_str_in_compressed_log(directory.path(), expected_value));

@pjankiewicz
Copy link
Author

@hawkw @sw-dev-code I think the PR is ready. I went back to our previous discussions and resolved some issues - hopefully I didn't miss anything. I checked clippy, tests, formatting etc. I can remove WIP flag right now in my opinion.

@pjankiewicz pjankiewicz changed the title WIP: Implement compression in tracing-appender Implement compression in tracing-appender Feb 23, 2022
@pjankiewicz
Copy link
Author

@davidbarsky I might have misclicked to ask you for the review in this PR. Sorry about this I think @hawkw has everything under control.

@sw-dev-code
Copy link

@pjankiewicz Thank you so much for your effort. How will I know when this implementation becomes part of the official crate?

@pjankiewicz
Copy link
Author

@pjankiewicz Thank you so much for your effort. How will I know when this implementation becomes part of the official crate?

@sw-dev-code It is not up to me. Assuming that it will be merged soon - it will be a part of some future release. As it is a change that doesn't break the current API I hope it will be pretty soon. In the meantime you could use this branch or manually apply the changes to 1.x version.

@sw-dev-code
Copy link

@pjankiewicz Ok, understood. Can you guide me on how to configure Rust to use your branch for tracing crate?

@pjankiewicz
Copy link
Author

@pjankiewicz Ok, understood. Can you guide me on how to configure Rust to use your branch for tracing crate?

It is possible to specify dependencies from git / github. See here for more information https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html#specifying-dependencies-from-git-repositories.

But I wouldn't suggest you use a specific version from any branch - it is much better to rely on stable releases of crates.

@sw-dev-code
Copy link

@pjankiewicz Thanks.

@bryangarza
Copy link
Member

(Going through all open PRs)

Seems this PR is pretty close to getting merged, added myself as reviewer for now, will try to look at it later today or tomorrow.

tracing-appender/src/rolling/builder.rs Show resolved Hide resolved
tracing-appender/src/rolling/builder.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/rolling/compression.rs Outdated Show resolved Hide resolved
tracing-appender/src/writer.rs Show resolved Hide resolved
tracing-appender/src/writer.rs Show resolved Hide resolved
tracing-appender/src/writer.rs Show resolved Hide resolved
tracing-appender/src/writer.rs Show resolved Hide resolved
@bryangarza bryangarza self-requested a review May 12, 2022 16:17
@bryangarza
Copy link
Member

@pjankiewicz seems the CI has some failures, could you resolve those? After that we can probably merge

@bryangarza
Copy link
Member

And I think there might be some merge conflicts too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants