Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance optimisations #10

Merged
merged 3 commits into from
Dec 5, 2024
Merged

Performance optimisations #10

merged 3 commits into from
Dec 5, 2024

Conversation

Humandoodlebug
Copy link
Contributor

@Humandoodlebug Humandoodlebug commented Nov 1, 2024

This PR consists of two commits, both of which make significant performance improvements to the parser. The first commit also makes a small API change by changing the type of FormatElement::Verbatim's field from String to &str and introducing a generic lifetime onto FormatElement.

I used the following test to benchmark the changes:

#[test]
fn perf_test() {
    let repeating_fmt_str = "test%utest%sing";
    let repeating_arg = 42;
    let repeating_arg_str = "foo";
    let fmt_str = {
        let mut s = String::new();
        for _ in 0..1000 {
            s.push_str(repeating_fmt_str);
        }
        s
    };
    let fmt_args: Vec<&dyn Printf> = {
        let mut v: Vec<&dyn Printf> = Vec::new();
        for _ in 0..1000 {
            v.push(&repeating_arg);
            v.push(&repeating_arg_str);
        }
        v
    };

    let start = std::time::Instant::now();
    for _ in 0..1000 {
        vsprintf(&fmt_str, &fmt_args).unwrap();
    }
    let end = std::time::Instant::now();
    eprintln!("took {:?}", end - start);
}

The test just formats a large string 1000 times and measures the time taken. This was just to get a rough idea of whether the changes improved performance, not to get precise impartial measurements.
I ran it like this:

$ cargo test --release perf_test -- --nocapture

On my laptop (amd64, linux, rust 1.81.0), the formatting part of the test takes ~8.8 seconds on v0.3.1. After the first commit (which is focused on reducing allocations), this comes down to ~2.7 seconds. After the second commit (which replaces the top-level recursion in the parser with a loop), this drops further to ~0.27 seconds. So overall, approximately a 30x performance increase.

The benchmark above focuses on longer format strings to try to mitigate some of the test overhead, but performance is improved even for shorter ones. Changing the benchmark above to only 2 iterations of the repeating format string and args and repeating the formatting 100,000 times gives me ~140 milliseconds on v0.3.1 and ~85 milliseconds after both commits.

This roughly triples performance.
Yields roughly a further 8x performance increase.
@tjol
Copy link
Owner

tjol commented Nov 30, 2024

Sorry it took me so long to get around to looking at this PR.

I like it. The numbers certainly sound impressive, and thanks for explaining what you've done so clearly, too!

It looks like right now there are a few places that need to be adapted to the API change for the test to build and run. If you could fix those I'll be happy to accept and release the changes.

@Humandoodlebug
Copy link
Contributor Author

Ah, I apologise; I forgot to run doctests! Should be fixed now.

@tjol tjol merged commit 54665d9 into tjol:main Dec 5, 2024
1 check passed
@tjol
Copy link
Owner

tjol commented Dec 5, 2024

Version 0.4.0 is on crates.io!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants