-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add --measure-time flag to libtest #64714
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Centril (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
is this different to #64663 |
Oh... Well, I started implementing this before that PR was submitted :c However, sort of yes. This PR does (and linked does not):
|
|
||
let time_str = format!(" <{}.{} s>", duration.as_secs(), duration.subsec_millis()); | ||
match &self.test_time_params { | ||
Some(params) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use guards here instead so we can fuse the .write_plain
s.
@@ -391,6 +392,7 @@ fn optgroups() -> getopts::Options { | |||
.optflag("", "exclude-should-panic", "Excludes tests marked as should_panic") | |||
.optflag("", "test", "Run tests and not benchmarks") | |||
.optflag("", "bench", "Run benchmarks instead of tests") | |||
.optflag("", "measure-time", "Measure tests execution time") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.optflag("", "measure-time", "Measure tests execution time") | |
.optflag("", "measure-time", "Measure test execution time") |
let test_threads = match matches.opt_str("test-threads") { | ||
Some(n_str) => match n_str.parse::<usize>() { | ||
Ok(0) => return Some(Err("argument for --test-threads must not be 0".to_string())), | ||
Ok(n) => Some(n), | ||
Ok(n) => { | ||
if measure_time && n != 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A guard would be better here.
Ok(n) => Some(n), | ||
Ok(n) => { | ||
if measure_time && n != 1 { | ||
return Some(Err("The \"measure-time\" flag excepts only one thread".into())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return Some(Err("The \"measure-time\" flag excepts only one thread".into())); | |
return Some(Err("The \"measure-time\" flag is only compatible with one thread".into())); |
@@ -600,7 +615,14 @@ pub fn parse_opts(args: &[String]) -> Option<OptRes> { | |||
))); | |||
} | |||
}, | |||
None => None, | |||
None => { | |||
if measure_time { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same; use a guard.
@@ -1436,6 +1520,10 @@ pub fn run_test( | |||
} | |||
} | |||
|
|||
let nocapture = opts.nocapture; | |||
let measure_time = opts.measure_time; | |||
let run_test_opts = RunTestOpts { nocapture, concurrency, measure_time }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why unpack this just to pack it again?
fn calc_result( | ||
desc: &TestDesc, | ||
task_result: Result<(), Box<dyn Any + Send>>, | ||
duration: Option<Duration> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duration: Option<Duration> | |
duration: Option<Duration>, |
@@ -203,6 +204,44 @@ fn parse_include_ignored_flag() { | |||
assert_eq!(opts.run_ignored, RunIgnored::Yes); | |||
} | |||
|
|||
#[test] | |||
fn parse_measure_time_flag() { | |||
// Provide --measure-time and --test flags. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Provide --measure-time and --test flags. | |
// Provide `--measure-time` and `--test` flags. |
assert!(opts.measure_time); | ||
assert_eq!(opts.test_threads, Some(1)); | ||
|
||
// Provide --measure-time flag without --test flag and with --bench flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Provide --measure-time flag without --test flag and with --bench flag. | |
// Provide `--measure-time` flag without `--test` flag and with `--bench` flag. |
let opts_error = parse_opts(&args_bench).unwrap(); | ||
opts_error.expect_err(expected_err); | ||
|
||
// Provide --measure-time flag with custom --test-threads flag. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Provide --measure-time flag with custom --test-threads flag. | |
// Provide `--measure-time` flag with custom `--test-threads` flag. |
Seems reasonable -- should this be the default perhaps? Not sure if this is feature gated or not, but I think we'd want to gate the behavior initially. |
See also #64663 which will probably be merged. Maybe this should build on that ? |
Seems like a good idea to do so yeah. |
Why is this? It should be possible to measure the time it takes to run a test independently of whether the tests are run in parallel or not.
With #64663, the test times are written to libtest json output - so if this is a real issue for real projects, that should allow them to write a simple tool that parses that and rejects test with longer runtimes using whatever metrics they want. That feels like a better solution to this problem than trying to solve this on the libtest side. |
fn write_ok_with_duration(&mut self, duration: Duration) -> io::Result<()> { | ||
self.write_pretty("ok", term::color::GREEN)?; | ||
|
||
let time_str = format!(" <{}.{} s>", duration.as_secs(), duration.subsec_millis()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If subsec_millis
returns less than three digits the output is wrong:
assert_ne!(" <1.023 s>", format!(" <{}.{} s>", 1, 23));
Hm... Ok, all the comments are seem reasonable. @gnzlbg, regarding test threads -- ok, then maybe use a following approach: If not of the |
Why do you want to force tests to run in one thread when reporting the run-time ? |
Imagine some resource acquired only synchronously (e.g. resource can be reached only after locking a static I understand that it's not a common case, but still a possible one, and I believe that compiler tools should avoid ambiguity when possible. Overall, if developers are sure that it's ok to run tests in parallel, they always will be able to provide a compiler flag for it. Or am I missing something here? |
libtest runs tests in parallel by default and it offers an option to choose the number of threads.
That's not an "ambiguity", that's printing out run-times that reflect reality. Changing the number of threads used by libtest silently when a user tries to debug these issues would be super confusing: "My tests run very slowly, let's print the run-times... oh! when I print the run-times my tests run fast, ?!?!?! (confusion)". |
Yeah, but when user notices that tests run very slow and decide to measure time, he will get N tests being reported as slow, when only the one of them is slow in fact. Thus this command won't be helping at all. IMHO, the purpose of that subcommand is to help user find the slow tests, and not just "report run-times that reflect reality". |
This is incorrect. If a test is reported as taking N seconds to run, then it is because it took N seconds to run. That's objective and correct data. Whether a tests takes N seconds to run because it does a lot of computation, or because its waiting doing nothing because some other tests holds a mutex lock, is a different issue. It appears that what you want is a libtest feature that lets you know which test is making other tests run slowly. Silently messing with the number of threads doesn't achieve that: static COUNT: AtomicUsize = AtomicUsize::new(0);
#[test] fn foo() { for _ in 0..COUNT.load(SeqCst) { println!("!"); } }
#[test] fn bar() { COUNT.write(usize::max_value(), SeqCst); } Here, even when using one thread, if |
The purpose of the other PR is to report unit test run-times. What users do with that data, is up to them. If your intent is to solve some other problem (adding some sort of functionality to libtest that points users to the tests that are causing other tests to run slowly), then it is probably better to open an issue to discuss that. The top post in this PR does not mention anything about this. |
OK, you're right. I will left test threads as they are then. |
FWIW I manually have to set the number of threads to 1 as well when running libtest to debug race conditions and similar issues (and i'm not alone), so I think it is worth to discuss somewhere whether there are things that we could be doing to make things easier for users here. |
Some thoughts: I like colored execution times, but I don't like default values. The proposed default values (500ms warn, 1000ms critical) don't fit everyone's use case. For example property based tests that take several seconds are not unusual. Running the tests with Alternatives:
|
r? @wesleywiser |
@Centril I don't really know anything about |
@wesleywiser Basically I did the code review, but I'm not best placed to judge whether the motivation of the PR is enough to land / account for the discussion. @varkor thought you'd be well placed as you've dealt with profiling and such things. |
Gotcha, that's fine. |
☔ The latest upstream changes (presumably #64663) made this pull request unmergeable. Please resolve the merge conflicts. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think this is a good feature addition and I don't have concerns about the overall approach. Here's the changes I'd like to see in order to land this PR:
- Suggested changes from the other reviewers should be done.
- I agree with @gnzlbg that we shouldn't be messing with the number of test threads. While the motivation is good, I think users will potentially be more confused if we don't replicate the original test environment as closely as possible when this flag is turned on.
- Since libtest: Add --report-time flag to print test execution time #64663 was landed behind an unstable flag, I think this option should also be behind the unstable flag.
I think including default values for the warning and critical times is a good idea but I think we should use different values for unit tests vs integration tests. Unit tests are supposed to be very, very fast so for those, I'd recommend 100ms as critical, 50ms as warn. For integration tests, I think the current values you have are ok. If detecting what kind of test is running is complicated, then feel free to ignore this. (Note: the Cargo book says that tests in src/*
are unit tests and tests in test/*
are integration tests.)
I'd also like to see some documentation for this flag and its configuration options in the Unstable Features page of the Cargo book.
When those three main issues are addressed, I'll happily approve this!
Sure, I'll perform the required changes as soon as I'll have time for this (hopefully this or next week). |
@popzxc Sounds good! If you decide to close this PR and open another, feel free to include
in the PR body so that I'll be assigned to it since I have the context for this. |
I don't think we have an easy way to detect this right now (IIRC, in both cases, the only thing that libtest "sees" is a list of |
Enhance report-time option ## Short overview This PR is a follow-up to a previously closed #64714 PR. ## Changes introduced by this PR * `libtest` now retrieves the type of the test within `TestDesc` (available types are: `UnitTest`, `IntegrationTest`, `DocTest`, `Unknown`). * `--report-time` subcommand of the `libtest` now supports colored output (disabled by default). * Colorized output depends on the threshold values. Default values (proposed by @wesleywiser): - For unit-tests: 50ms warn/100ms critical, - For integration-tests: 500ms warn/1000ms critical, - For doctests: same as for integration tests, - For unknown tests: `TEST_WARN_TIMEOUT_S` warn/ `TEST_WARN_TIMEOUT_S * 2` critical (it will only applied single-threaded mode, because otherwise test will be interrupted after reaching `TEST_WARN_TIMEOUT_S`). - These values can be overrided by setting environment variables (since those thresholds are somewhat constant for every project, it's more flexible to use environment variables than command line arguments). * New optional flag `--ensure-test-time` for `libtest`. With this flag applied, exectuion time limit excesss will cause test failure. ## What have not been done There was a comment that it would be nice to have an entry in the Cargo book about it. However, changes introduced by this PR (and #64663 in which `report-time` flag was added) aren't related directly to `cargo`, it's more about `libtest` itself. I'm considering that [The Unstable Book](https://doc.rust-lang.org/unstable-book/) is more appropriate place, but not sure if I'm right (and if so, how exactly it should be described). As one possible option, this PR may be merged without denoting it in the documentation, and in the next PR adding support of this feature to the `cargo` itself, I'll add a note in the Cargo book. ## Scope of this PR Logical scope of this PR is `libtest` only. However, to get test types, I had to modify also `libsyntax_ext` and `librustdoc` for them to provide information about test type. ## Rationale Rationale for colored output was submitted in #64714 Providing the information about kind of test was also proposed in #64714, and as an additional benefit this information may be useful for the tools using `libtest` (e.g. `cargo`). Adding flag to treat time limits excess seems logical to me, so projects that do care about test execution time won't have to invent a wheel. ## Backward compatibility All the changes are completely backward compatible. ## Demo ![rustc_enhanced_time](https://user-images.githubusercontent.com/12111581/65818381-c04f6800-e219-11e9-9875-322463abe24f.gif) r? @wesleywiser
Overview
This PR introduces a new option flag for
rust
test
command:measure-time
. With this flag applied,rustc
will measure the execution time of each test and print it out (into file log or ifpretty
formatter is chosen).--measure-time
is intended for a single-threaded run, so using with--test-threads
set to any value other than1
will result in an error.Time is measured only when flag is applied, thus there is no overhead for common runs.
In the output, time is written in red if test duration time >=
RUST_TEST_TIME_CRITICAL
(defaults to 1000 milliseconds), in yellow if test duration time >=RUST_TEST_TIME_WARN
(defaults to 500 milliseconds), otherwise default formatting is applied.Rationale
Usually, big Rust projects have a lot of tests. And tests are not always written carefully. As a result, if some poorly written test gets merged into a big project, it will be difficult to find that test, because there is no tool for that.
libtest
checks for timed out tests, but the constantTEST_WARN_TIMEOUT_S
is not configurable, and it's not really an instrument to detect not optimized tests.This PR allows developers to inspect their codebase to detect long-running tests and optimize/rewrite/remove them.
Backward compatibility
This PR is fully backward compatible, it does not introduce any breaking changes.
r? @Centril