Use CPU time for --warn-long #36226

orlitzky · 2023-09-09T15:30:43Z

Finally address #32981.

Using wall time to measure which tests are "long" doesn't make much sense. Wall time is subject to every conflating factor:

Faster hardware runs tests in less wall time
Faster compiler flags makes code that runs in less wall time
A busy machine will run tests slower than an unburdened one
Operating system priority can increase or decrease the time taken by a process
etc.

Ideally, everyone would agree on how long is too long for a test to run. Attaining that ideal is not so easy, but the first step is to switch--warn-long away from using wall time and towards using CPU time. A second phase (#33022) will normalize the CPU time.

Examples

--warn-long output

The user-facing portion of this can be tested with a low --warn-long value,

$ sage -t --warn-long=1 src/sage/rings/qqbar.py 
...
Doctesting 1 file.
sage -t --warn-long --random-seed=309742757431591512726158538133919135994 src/sage/rings/qqbar.py
**********************************************************************
File "src/sage/rings/qqbar.py", line 511, in sage.rings.qqbar
Warning, slow doctest:
    z, = [r for r in p4.roots(QQbar, False) if r in ival]
Test ran for 1.51s cpu, 1.51s wall
Check ran for 0.00s cpu, 0.00s wall
**********************************************************************
File "src/sage/rings/qqbar.py", line 524, in sage.rings.qqbar
Warning, slow doctest:
    z2 = QQbar.polynomial_root(p4, ival)
Test ran for 1.07s cpu, 1.07s wall
Check ran for 0.00s cpu, 0.00s wall
**********************************************************************
File "src/sage/rings/qqbar.py", line 808, in sage.rings.qqbar.AlgebraicField_common._factor_multivariate_polynomial
Warning, slow doctest:
    F = QQbar._factor_multivariate_polynomial(p)
Test ran for 2.38s cpu, 2.42s wall
Check ran for 0.00s cpu, 0.00s wall

...

pexpect accounting

Two different methods are used to account for the CPU time, one for linux, and one for everything else. This GAP command uses pexpect and should chew up a good bit of CPU time:

sage: from sage.doctest.util import Timer
sage: t = Timer()
sage: _ = t.start(); t.stop()    # no CPU time used between start/stop
{'cputime': 0.0, 'walltime': 7.605552673339844e-05}

sage: _ = t.start(); gap.eval('a:=List([0..10000],i->WordAlp("a",i));; IsSortedList(a);'); t.stop()    # lots of CPU time used
'true'
{'cputime': 30.089999999999996, 'walltime': 36.292088747024536}

src/sage/doctest/util.py

src/sage/doctest/control.py

src/sage/doctest/util.py

jhpalmieri · 2023-09-27T21:10:33Z

src/sage/doctest/forker.py

@@ -1467,7 +1475,7 @@ def report_failure(self, out, test, example, got, globs):
                    self._fakeout.start_spoofing()
            return returnval

-    def report_overtime(self, out, test, example, got, *, check_duration=0):
+    def report_overtime(self, out, test, example, got, *, check_timer=None):


The old keyword should be deprecated using sage.misc.decorators.rename_keyword, rather than just changed.

The keyword wasn't simply renamed -- the type of argument that it takes was changed too. Typically that would be an unfriendly thing to do, but this is essentially a private method. No other part of the sage library uses it (it's referenced exactly once, by self), and since it's part of sage's custom doctest runner, it's hard to imagine a third party relying on these internal details. Frankly I'd like to change it to an underscore method now that you mention it.

The keyword wasn't simply renamed -- the type of argument that it takes was changed too. Typically that would be an unfriendly thing to do, but this is essentially a private method. No other part of the sage library uses it (it's referenced exactly once, by self), and since it's part of sage's custom doctest runner, it's hard to imagine a third party relying on these internal details.

Cool. Sage policy is still to deprecate it. So deprecate the old argument and add a new one.

There's no way to translate the old argument into the new one. If we keep the old name but ignore its argument, then any code using this function will still be immediately broken, only more quietly and with more steps.

If you feel that strongly about not changing this method's signature, I can copy & paste the entire function into a new private method _report_overtime. And then we should do the same for _report_success. That will leave the old methods intact to be deprecated, but I really think we would be wasting our time.

It's going to take time to put the old functions back and to add the deprecation warnings. It will take time to re-review those changes. It's going to take time to run the doctests for two unused functions in the interim. And then eventually someone has to spend the time to remove the functions, and someone else has to spend the time to review that.

Unless there's an (even remotely) plausible reason why anyone would be calling this function in their own code, it's justifiable to bend the deprecation rule. Its intent has been respected. And to do otherwise is wasting our most valuable resource.

src/sage/doctest/util.py

jhpalmieri · 2023-09-27T21:13:41Z

Not your job here, but would it be worthwhile to have a summary of overly long doctests printed at the end of doctesting?

orlitzky · 2023-09-29T02:54:01Z

Not your job here, but would it be worthwhile to have a summary of overly long doctests printed at the end of doctesting?

Yes, but my vote would be to collect all warnings into that report.

github-actions · 2023-09-29T15:05:09Z

Documentation preview for this PR (built with commit efa14c3; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

orlitzky · 2024-04-17T00:50:23Z

@mkoeppe I have two modularization failures (in repl & categories) that I'm sure you know how to fix. Both fail at,

[sagemath_categories-10.4.beta3] [spkg-check]   File "/sage/pkgs/sagemath-categories/.tox/sagepython-sagewheels-nopypi-norequirements/lib/python3.10/site-packages/sage/doctest/util.py", line 221, in _quick_cputime
  [sagemath_categories-10.4.beta3] [spkg-check]     from sage.interfaces.quit import expect_objects
  [sagemath_categories-10.4.beta3] [spkg-check] ModuleNotFoundError: No module named 'sage.interfaces.quit'

orlitzky · 2024-04-17T01:06:27Z

The CI has the following test failing on Conda/macOS:

sage: from sage.doctest.util import Timer                                                                                  
sage: cputime = 0.0                                                                                                        
sage: try:                                                                                                                 
....:     cputime = Timer()._pid_cpu_seconds(1)                                                                            
....: except OSError:                                                                                                      
....:     pass                                                                                                             
sage: cputime >= 0.0                                                                                                       
True                                                                                                                       
sage: isinstance(cputime, float)                                                                                           
True

The failure is,

Failed example:
    isinstance(cputime, float)
Expected:
    True
Got:
    False

The function either raises an OSError, or should return a float divided by an integer, which I'm expecting to be a float itself. I could just cast it with return float(...), but if anyone with macOS can reproduce it, I'd rather know why I'm casting it.

tornaria · 2024-04-23T01:47:29Z

The CI has the following test failing on Conda/macOS:

sage: from sage.doctest.util import Timer                                                                                  
sage: cputime = 0.0                                                                                                        
sage: try:                                                                                                                 
....:     cputime = Timer()._pid_cpu_seconds(1)                                                                            
....: except OSError:                                                                                                      
....:     pass                                                                                                             
sage: cputime >= 0.0                                                                                                       
True                                                                                                                       
sage: isinstance(cputime, float)                                                                                           
True

The failure is,

Failed example:
    isinstance(cputime, float)
Expected:
    True
Got:
    False

The function either raises an OSError, or should return a float divided by an integer, which I'm expecting to be a float itself. I could just cast it with return float(...), but if anyone with macOS can reproduce it, I'd rather know why I'm casting it.

Well, sure:

sage: cputime = 0.0
sage: isinstance(cputime, float)
False

You forgot (a) the preparser makes 0.0 into a RealLiteral and (b) a RealLiteral is not an instance of a float.

orlitzky · 2024-04-23T12:19:01Z

You forgot (a) the preparser makes 0.0 into a RealLiteral and (b) a RealLiteral is not an instance of a float.

FFS. Thanks.

orlitzky · 2024-04-23T15:32:47Z

Still the two modularization failures. Judging from the line,

include sage/misc/timing.p*             # walltime, cputime used in sage.doctest

in sagemath-objects/MANIFEST.in, we probably need to include sage.interfaces.quit there as well (because now sage.doctest uses it to obtain the list of running "expect" interfaces).

I dare not make this change myself. @mkoeppe y/n?

orlitzky · 2024-10-11T22:43:15Z

Rebased again. I'll try to make that modularization fix myself I guess.

If anyone is tired of repeatedly having to split up the test runs in CI only to still get false positives due to timeouts, this is an important step towards addressing the root cause of those failures.

Like ZombieProcess... this can happen. And the doctest framework shouldn't crash if a pexpect process was killed.

We're importing psutil and trying to catch the ImportError if one is raised. But for that to work, it helps to put "import" inside of the "try" block.

If SystemExit occurs during a doctest, we re-raise it. The same thing should be done for SignalError (from cysignals). Signals can occur at any time, and the doctest runner shouldn't eat them. In particular, this can lead to a Heisenbug where a signal will terminate a doctest example before the timer has a chance to annotate the it with the elapsed walltime and cputime. In that case, we get an "impossible" AttributeError later on when we try to compute the total walltime used by the example.

orlitzky · 2024-10-16T22:46:30Z

CI is finally green 💚 🍏 📗 🟢

tobiasdiez

The tests report slow docstests although those are marked as "long time" (see annotations at the bottom in https://github.com/sagemath/sage/actions/runs/11373830486?pr=36226). Is this by design because those tests take longer than 30s? If yes, what is the plan with these "super long" tests?

Other than this point, the PR looks good to me.

tobiasdiez · 2024-10-19T23:58:53Z

src/sage/doctest/control.py

+        # being rather generous here.
+        self.options.warn_long = 5.0
+        if self.options.long:
+            self.options.warn_long = 30.0

    def second_on_modern_computer(self):


This method can now be deleted, right?

Yes, but it's a "public" function, so I didn't want to start an argument over the deprecation policy.

I think it is safe to consider everything in doctest as private. But as a compromise, at least add a deprecation warning so that this can be removed in 1year?

orlitzky · 2024-10-20T00:25:39Z

The tests report slow docstests although those are marked as "long time" (see annotations at the bottom in https://github.com/sagemath/sage/actions/runs/11373830486?pr=36226). Is this by design because those tests take longer than 30s? If yes, what is the plan with these "super long" tests?

Right. Tests that are marked # long time are only run with sage -t --long, but that's rather independent of the warnings. Currently the doctest runner picks some "random," too-large number to set as the default --warn-long value. For example,

$ sage -t src/sage/env.py
...
sage -t --warn-long 77.9 --random-seed=43834777074638100050904209660685805841 src/sage/env.py

That number doesn't depend on --long, and even for # long time tests, it is much too large. Our guidelines say that a normal test should complete in under 1s, and a "long" test in 5s. As it stands, we never get warnings about non-long tests that run too long, because the --warn-long value is high enough to accommodate even badly-behaving # long time tests. We also get few warnings about the # long time tests, because again, the number is just too high.

I've changed this in two ways:

The default --warn-long number depends on whether or not --long was used. This lets us detect slow non-long tests when --long is not used.
Those --warn-long numbers are deterministic, and low enough to be useful.

While 30s is still "super long," I'm trying to avoid turning the test output into an unreadable mess of warnings. We have a lot of tests that are too slow because the current behavior doesn't catch them. My plan was to start at something no one could object to (30s) and then lower it slowly as the existing issues get fixed.

tobiasdiez · 2024-10-20T00:32:47Z

Okay, this makes a lot of sense.

However, on CI you now have quite a bit of warnings at the end - which are usually unrelated to the PR. While this might stimulate people into helping making these tests quicker, I would prefer if the annotations are actually relevant to a given PR.

Can you make the warn-long times even higher on CI (check for the CI env variable should do)? Or otherwise hide those warnings for now?

This method of the DocTestController class is no longer used in the sage library, but technically, it was "public," so we deprecate it for later removal.

orlitzky · 2024-10-20T13:06:46Z

Okay, this makes a lot of sense.

However, on CI you now have quite a bit of warnings at the end - which are usually unrelated to the PR. While this might stimulate people into helping making these tests quicker, I would prefer if the annotations are actually relevant to a given PR.

Can you make the warn-long times even higher on CI (check for the CI env variable should do)? Or otherwise hide those warnings for now?

Yeah I didn't notice those annotations. I guess github turns the word "Warning" into an error, and doesn't give you a way to disable it, selectively or otherwise. I would like to keep the "slow doctest" warnings from the CI, just not the fake errors. I pushed a commit that will change the "Warning, slow doctest" into "Uh oh, slow doctest" when running under the CI. If it works, we can still ctrl-f the logs for "slow doctest," but they won't get converted to error annotations.

tobiasdiez · 2024-10-20T14:05:46Z

These annotations are generated here:

sage/src/sage/doctest/reporting.py

Lines 263 to 280 in 7726cd9

    
           elif format == 'github': 
        
               # https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#using-workflow-commands-to-access-toolkit-functions 
        
               command = f'::error title={fail_msg}' 
        
               command += f',file={source.printpath}' 
        
               if output: 
        
                   if m := re.search("## line ([0-9]+) ##\n-{40,100}\n(.*)", output, re.MULTILINE | re.DOTALL): 
        
                       lineno = m.group(1) 
        
                       message = m.group(2) 
        
                       command += f',line={lineno}' 
        
                   else: 
        
                       message = output 
        
                   # Urlencoding trick for multi-line annotations 
        
                   # https://github.com/actions/starter-workflows/issues/68#issuecomment-581479448 
        
                   message = message.replace('\n', '%0A') 
        
               else: 
        
                   message = "" 
        
               command += f'::{message}' 
        
               log(command)

It might be enough to use notice instead of error as the reporting mechanism for too-long-runs. Then they still would show up as annotations but not in red (no idea how exactly they are displayed). Alternatively, don't add the file + line info for too-long-warnings in which case the would only show up in the github action summary page but not in the review/code display. I would be fine with both solutions.

orlitzky · 2024-10-20T14:42:18Z

These annotations are generated here:

I looked at this but I didn't see a connection to the slow doctest warnings. The error annotations are added in _log_failure(), and shouldn't affect anything unless the test fails? I wanted to change the "Warning" label as an experiment, but not even that has worked. For whatever reason, os.environ.get("CI") is not returning "true" under the CI.

tobiasdiez · 2024-10-20T14:51:23Z

These annotations are generated here:

I looked at this but I didn't see a connection to the slow doctest warnings. The error annotations are added in _log_failure(), and shouldn't affect anything unless the test fails?

No idea...but I assumed it would be generated there, since these annotations are "errors" (red) and not "warnings" (yellow). So if github would just magically parse "Warnings" then they would most likely be shown in yellow.

I wanted to change the "Warning" label as an experiment, but not even that has worked. For whatever reason, os.environ.get("CI") is not returning "true" under the CI.

Probably because the ci uses its own docker container and never sets CI as would be normally the case with github. Try to check for

sage/.ci/write-dockerfile.sh

Line 269 in 7726cd9

ENV GITHUB_ACTIONS=1

orlitzky · 2024-10-20T15:05:31Z

Ah, thanks. I see invocations like,

2024-10-20T13:04:25.5703322Z ##[command]/usr/bin/docker create --name b3360c32c4bf46369bb60408efada007_registry2_ae6c9c --label 00c75c --network github_network_152498b2c6d54a50922b8e3fb958afe5 --network-alias registry -p 5000:5000  -e GITHUB_ACTIONS=true -e CI=true registry:2

So maybe we just need to pass $CI through to the dockerfile as well. We'll know in an hour...

orlitzky · 2024-10-20T18:56:53Z

Back to square one:

Github changes the warnings to error annotations, even if I change the word "warning" to "uh oh"
We are not in _log_failure, so the custom "github" report format is not playing a part (you can confirm with raise SystemExit or whatever)

I think I've done enough blind debugging of Microsoft's crap today.

Switch "Warning, slow doctest:" to "Warning: slow doctest:" so that the leading "Warning: " is consistent with other warnings. This ultimately prevents these slow doctest warnings from being converted into error annotations on the Github CI.

orlitzky · 2024-10-20T22:29:13Z

I finally got it to use warning annotations (and not errors) by switching the Warning, to Warning:

orlitzky · 2024-10-24T22:49:24Z

@tobiasdiez is it OK with the warning annotations, or do you want me to suppress them further?

tobiasdiez

I would prefer if they wouldn't show up at all, but it's fine like this for me.

Feel free to set it to positive review, or wait a little bit if you still want a second pair of eyes to look at the code.

orlitzky · 2024-10-25T12:29:24Z

Ok, thank you. I started working on this in December of 2021 so I'd quite like to take it off my list and start working on the follow-up issues.

orlitzky changed the title ~~Use CPU time for --warn-long~~ (Draft) Use CPU time for --warn-long Sep 9, 2023

orlitzky force-pushed the warn-long-cputime branch 3 times, most recently from 1f56451 to c3d1af6 Compare September 9, 2023 23:09

orlitzky changed the title ~~(Draft) Use CPU time for --warn-long~~ Use CPU time for --warn-long Sep 10, 2023

orlitzky added the s: needs review label Sep 10, 2023

orlitzky mentioned this pull request Sep 10, 2023

Use CPU time for --warn-long and a low default limit for non-long tests #32981

Closed

tornaria reviewed Sep 13, 2023

View reviewed changes

src/sage/doctest/util.py Outdated Show resolved Hide resolved

tornaria reviewed Sep 13, 2023

View reviewed changes

src/sage/doctest/control.py Outdated Show resolved Hide resolved

jhpalmieri reviewed Sep 21, 2023

View reviewed changes

src/sage/doctest/control.py Outdated Show resolved Hide resolved

jhpalmieri reviewed Sep 21, 2023

View reviewed changes

src/sage/doctest/control.py Outdated Show resolved Hide resolved

jhpalmieri reviewed Sep 27, 2023

View reviewed changes

src/sage/doctest/util.py Outdated Show resolved Hide resolved

jhpalmieri reviewed Sep 27, 2023

View reviewed changes

src/sage/doctest/util.py Outdated Show resolved Hide resolved

orlitzky force-pushed the warn-long-cputime branch from c019843 to 21e5d4d Compare September 29, 2023 11:58

vbraun force-pushed the develop branch from 883e05f to e349b00 Compare November 12, 2023 16:25

mkoeppe added the c: doctest framework label Feb 9, 2024

orlitzky force-pushed the warn-long-cputime branch from 21e5d4d to 247c635 Compare April 16, 2024 12:30

vbraun force-pushed the develop branch from eba5e19 to e5f42fa Compare June 3, 2024 22:15

orlitzky force-pushed the warn-long-cputime branch from d6862d7 to 4f049b6 Compare October 11, 2024 22:23

orlitzky force-pushed the warn-long-cputime branch from ab62b56 to 945ae45 Compare October 12, 2024 02:09

src/sage/doctest/util.py: handle NoSuchProcess from psutil, too

993327b

Like ZombieProcess... this can happen. And the doctest framework shouldn't crash if a pexpect process was killed.

orlitzky force-pushed the warn-long-cputime branch from 98dbd96 to 993327b Compare October 16, 2024 14:41

orlitzky added 2 commits October 16, 2024 16:54

src/sage/doctest/util.py: put "import" inside the "try"

7aa6be8

We're importing psutil and trying to catch the ImportError if one is raised. But for that to work, it helps to put "import" inside of the "try" block.

orlitzky force-pushed the warn-long-cputime branch from 12c0fa6 to cf55b38 Compare October 16, 2024 21:04

This was referenced Oct 17, 2024

Draft: doctest framework eats SignalErrors #38815

Closed

A few outrageously long tests #38827

Open

src/sage/env.py: canonicalize paths in a test #38826

Merged

tobiasdiez reviewed Oct 20, 2024

View reviewed changes

sage/doctest/control.py: deprecate second_on_modern_computer()

660e8ca

This method of the DocTestController class is no longer used in the sage library, but technically, it was "public," so we deprecate it for later removal.

orlitzky force-pushed the warn-long-cputime branch from 205462a to 0898772 Compare October 20, 2024 17:02

orlitzky force-pushed the warn-long-cputime branch from 0898772 to 660e8ca Compare October 20, 2024 18:58

tobiasdiez approved these changes Oct 25, 2024

View reviewed changes

orlitzky added s: positive review and removed s: needs review labels Oct 25, 2024

vbraun merged commit 7e4f7bd into sagemath:develop Oct 26, 2024
18 of 20 checks passed

github-actions bot removed the s: positive review label Oct 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use CPU time for --warn-long #36226

Use CPU time for --warn-long #36226

orlitzky commented Sep 9, 2023 •

edited

Loading

jhpalmieri Sep 27, 2023

orlitzky Sep 29, 2023

jhpalmieri Sep 29, 2023

orlitzky Sep 29, 2023

orlitzky Sep 29, 2023

jhpalmieri commented Sep 27, 2023

orlitzky commented Sep 29, 2023

github-actions bot commented Sep 29, 2023 •

edited

Loading

orlitzky commented Apr 17, 2024

orlitzky commented Apr 17, 2024

tornaria commented Apr 23, 2024

orlitzky commented Apr 23, 2024

orlitzky commented Apr 23, 2024

orlitzky commented Oct 11, 2024

orlitzky commented Oct 16, 2024

tobiasdiez left a comment

tobiasdiez Oct 19, 2024

orlitzky Oct 20, 2024

tobiasdiez Oct 20, 2024

orlitzky Oct 20, 2024

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 24, 2024

tobiasdiez left a comment

orlitzky commented Oct 25, 2024

Use CPU time for --warn-long #36226

Use CPU time for --warn-long #36226

Conversation

orlitzky commented Sep 9, 2023 • edited Loading

Examples

--warn-long output

pexpect accounting

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jhpalmieri commented Sep 27, 2023

orlitzky commented Sep 29, 2023

github-actions bot commented Sep 29, 2023 • edited Loading

orlitzky commented Apr 17, 2024

orlitzky commented Apr 17, 2024

tornaria commented Apr 23, 2024

orlitzky commented Apr 23, 2024

orlitzky commented Apr 23, 2024

orlitzky commented Oct 11, 2024

orlitzky commented Oct 16, 2024

tobiasdiez left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

tobiasdiez commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 20, 2024

orlitzky commented Oct 24, 2024

tobiasdiez left a comment

Choose a reason for hiding this comment

orlitzky commented Oct 25, 2024

orlitzky commented Sep 9, 2023 •

edited

Loading

github-actions bot commented Sep 29, 2023 •

edited

Loading