Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add Benchmark for rawloader #82

Closed
wants to merge 3 commits into from
Closed

[WIP] Add Benchmark for rawloader #82

wants to merge 3 commits into from

Conversation

jhwgh1968
Copy link

@jhwgh1968 jhwgh1968 commented Apr 6, 2021

Closes #46.

I am hoping for an early review, @anp. Some notes:

  • This benchmark requires manually downloading an image file from the internet. It is Creative Commons Licensed, but cannot be downloaded by the benchmark itself due to Cloudflare. If you would prefer I commit the 18 MB file to the repo, I can.
  • This library is LGPL 2.1, as is the benchmark code I started with. I have noted this in the license key of the Cargo.toml itself. If there is somewhere else I should note it, let me know.
  • The benchmark code used to be an executable with a main() function that did its own iteration, and had no black_box. Therefore, I am hoping this version does not need a call to black_box either.
  • It is not 100% clear to me whether the latest version of the library on crates.io matches master or not, so I pinned it by git commit, and use that as the version number.
  • I am not touching anything about the runners, as per the directions.
  • I am hoping the Cargo.lock additions will not break anything, and will pin its dependency version numbers. If I am wrong, how do I fix that?
  • While I have verified the benchmark executes, I cannot figure out how to actually measure its performance on my own system to compare it with the original. cargo bench does nothing, and cargo test --release does not give me any numbers. Is this expected?

I imagine I left something out, so please let me know what. 😄

@jhwgh1968 jhwgh1968 marked this pull request as draft April 6, 2021 00:29
@bjorn3
Copy link

bjorn3 commented Apr 6, 2021

This library is LGPL 2.1, as is the benchmark code I started with. I have noted this in the license key of the Cargo.toml itself. If there is somewhere else I should note it, let me know.

Rust statically links, so anything depending on LGPL must itself be LGPL licensed.

While I have verified the benchmark executes, I cannot figure out how to actually measure its performance on my own system to compare it with the original. cargo bench does nothing, and cargo test --release does not give me any numbers. Is this expected?

From the readme:

cargo test-all runs the test for every benchmark function, which consists of warming it up and running through a couple of iterations. caution: this will generate dozens of gigabytes of data in your target directory.

@jhwgh1968
Copy link
Author

jhwgh1968 commented Apr 6, 2021

Rust statically links, so anything depending on LGPL must itself be LGPL licensed.

My non-lawyer understanding is that all 8 of the dependencies are LGPL compatible due to dual licensing under MIT. (Unfortunately Apache 2.0 is not LGPL 2 compatible.)

As a result, any user who builds this benchmark can implicitly choose MIT for the dependencies, and then be forced by the library to opt into the LGPL for that particular executable (as per Section 3, Paragraph 6). Other binaries and code are unaffected.

Given the warning about licensing in the README, and the general use case for these benchmarks, I understood this to be acceptable.

From the readme:

Oh. I didn't read that as, "this is the only way to run the tests," but rather as "don't do this unless you own a giant box and are ready for significant resource use." Thanks for the clarification.

Copy link
Owner

@anp anp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks! To inform expectations, you should probably know that the bots for this project haven't been running for a bit (#68) and I'm currently very slowly moving the CI over to something that works today (#72). Landing might take a little while 😅 .

Re: LGPL, I agree that the specific benchmark binaries statically linking the code need to be LGPL as well, I think it's set up correctly for this PR. AIUI, there's nothing about LGPL that would suggest that lolbench_support would need to be relicensed to be linked into the same benchmark binary. When it comes down to it, the provenance is important but redistributing this many binaries with code cribbed from this many sources is always going to be fraught (which is fine because we mostly need data, not runnable programs to download).

If you would prefer I commit the 18 MB file to the repo, I can.

I think that's probably the right call, left a review comment.

I am hoping the Cargo.lock additions will not break anything, and will pin its dependency version numbers. If I am wrong, how do I fix that?

Don't worry too much about that right now, because the bots aren't running we can consider this an extended maintenance window :P.

benches/rawloader_c793a13/tests/decode-decode.rs Outdated Show resolved Hide resolved
benches/rawloader_c793a13/src/lib.rs Outdated Show resolved Hide resolved
lolbench_support = { path = "../../support" }

[dependencies.rawloader]
version = "*"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave a comment here explaining what you said in your PR description about a mismatch?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

This dependency is precisely pinned to rawloader, based on the git commit I tested. That is fine.

However, what about the library's dependencies? They are:

[dependencies]
toml = "0.5"
enumn = "0.1"
lazy_static = "1"
byteorder = "1"
rayon = "1"
itertools = "0.9"
serde = "1"
serde_derive = "1"

To my surprise, when I added this and did a fetch, it only added new things to Cargo.lock, but I am not sure if this situation will continue.

I know there is an open issue for this, #64. But I was wondering if I should do something to make the pinning tighter and avoid worsening things? Like overriding itertools to =0.9.1 for example? Or is that not necessary?

@jhwgh1968
Copy link
Author

Hey, thanks! To inform expectations, you should probably know that the bots for this project haven't been running for a bit (#68) and I'm currently very slowly moving the CI over to something that works today (#72). Landing might take a little while

Understood.

... Although I would say, progress looks pretty good. The only failing check on that PR is the (dead for a while) website.

Perhaps it would be possible to find a "good enough" CI for the time being, just to accept PRs again? I do see this referred to in Rust issues at times, so I can't be the only one who is interested in it.

When it comes down to it, the provenance is important but redistributing this many binaries with code cribbed from this many sources is always going to be fraught (which is fine because we mostly need data, not runnable programs to download).

Indeed. The "source only" nature of this project -- where redistributing any binaries someone built would mostly defeat the purpose -- is the reason I didn't consider the licensing issue a showstopper.

I think that's probably the right call, left a review comment.

Since the website died due to the size of the repo, I thought I'd be conservative and ask before bloating the repo significantly with one commit! 😄


A separate, stupider question, @anp: when I run cargo test-all -- even on master -- I get a hang:

     Running target/release/deps/bench_main-6212ce7833fa9a09
Benchmarking Fibonacci/Recursive: Warming up for 3.0000 s
Benchmarking Fibonacci/Recursive: Collecting 100 samples in estimated 5.0000 s (13B iterations)

The warm-up actually took 3 seconds, but the process burned 100% of a single CPU core on that second message for 45 minutes before I killed it.

Dumping the assembly shows a hot and tight recursive function, that, well, looks like Fibonacci. But I can't seem to see it ever changing stack frame sizes.

Worse yet, I can't even figure out how to remove it. Even after deleting the rayon benchmarks -- where there are functions with matching names -- and doing cargo clean, it still did this.

This was with both Rustc 1.36 (which the project wanted me to download) and current stable (1.51). What am I missing?

@jhwgh1968 jhwgh1968 closed this by deleting the head repository Aug 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add rawloader
3 participants