Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exclude non-executable lines #892

Merged
merged 10 commits into from
Feb 16, 2022
Merged

Exclude non-executable lines #892

merged 10 commits into from
Feb 16, 2022

Conversation

Slamdunk
Copy link
Contributor

@Slamdunk Slamdunk commented Feb 9, 2022

Fixes #889

NOTE: Tests that fail only represent the new behavior which excludes what StaticAnalysis\ExecutableLinesFindingVisitor tells it's not a LoC: I'll update the entire test suite as soon as this PR receives a positive feedback

Taking the examples from https://github.com/Slamdunk/phpunit_codecoverage_bug#readme here the result:

Before After
image image
image image
image image
image image

@sebastianbergmann
Copy link
Owner

This looks good to me at first glance. What do you think, @dvdoug?

@dvdoug
Copy link
Contributor

dvdoug commented Feb 9, 2022

Theory looks sound, although it makes me a little sad that we have to determine this in userland

foreach (array_keys($data->lineCoverage()) as $filename) {
$data->keepCoverageDataOnlyForLines(
$filename,
$this->uncoveredFileAnalyser()->executableLinesIn($filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be moved out of uncoveredFileAnalyser if it's going to be called for covered files as well

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there will also a performance impact from calling executableLinesIn so much that it might be a good idea to try and mitigate - if someone has caching enabled, then that will read from file each time but it might be a good idea to add a memory cache somewhere? Reads on Windows particularly are notoriously slow, and I think this would do an additional read per source file per test if my code-skim is correct?

Copy link
Contributor

@dvdoug dvdoug Feb 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For non-cached version:

Current

    public function enterNode(Node $node): void
    {
        if (!$this->isExecutable($node)) {
            return;
        }

        $this->executableLines[] = $node->getStartLine();
    }

    /**
     * @psalm-return list<int>
     */
    public function executableLines(): array
    {
        $executableLines = array_unique($this->executableLines);

        sort($executableLines);

        return $executableLines;
    }

Faster?



    public function enterNode(Node $node): void
    {
        if (!$this->isExecutable($node)) {
            return;
        }

        $this->executableLines[$node->getStartLine()] = $node->getStartLine();
    }

    /**
     * @psalm-return list<int>
     */
    public function executableLines(): array
    {
        return $executableLines;
    }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be moved out of uncoveredFileAnalyser if it's going to be called for covered files as well

Sure, I'll leave refactoring after the main solution has landed with all tests passing

I think there will also a performance impact from calling executableLinesIn so much

I have no experience on Windows, but on Linux real cases give me negligible impact:

Files: 906, Methods: 4402, Lines: 33708
Tests: 2989, Assertions: 19314, Skipped: 22

3 runs on Paratest, Standard SATA disk for cache

Before

Time: 02:02.831, Memory: 200.54 MB
Generating code coverage report ... done [00:06.284]

Time: 02:03.884, Memory: 198.84 MB
Generating code coverage report ... done [00:05.837]

Time: 02:02.206, Memory: 200.58 MB
Generating code coverage report ... done [00:05.936]

After

Time: 02:04.346, Memory: 176.52 MB
Generating code coverage report ... done [00:05.902]

Time: 02:06.492, Memory: 176.50 MB
Generating code coverage report ... done [00:05.635]

Time: 02:05.663, Memory: 176.52 MB
Generating code coverage report ... done [00:05.731]

$this->executableLines[$node->getStartLine()] = $node->getStartLine();

I thought this too: be aware that we need to version the cache, otherwise new release will break user reports if run onto old caches

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's an interesting post by an MS engineer at microsoft/WSL#873 (comment) in the context of WSL that compares/contrasts performance of the filesystem in Linux vs Windows. Suffice to say it's real

Windows 11, PHP 8.1.2, XDebug 3.1.3, running tests for this repo

9.2 branch

Time: 00:19.726, Memory: 360.00 MB

Time: 00:21.713, Memory: 360.00 MB

Time: 00:23.309, Memory: 360.00 MB

This PR

Time: 00:53.541, Memory: 374.00 MB

Time: 00:44.610, Memory: 374.00 MB

Time: 00:48.638, Memory: 374.00 MB

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer, I'll have a look (hopefully tomorrow morning, otherwise over the weekend).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

I think the uncovered file analyser not having a memory cache is OK because it's normally only called once at the end of the entire test suite run just before generating the report, unlike the analyser for covered files that's called after every test. It's just that here, the relevant function is now being called from a very different context...

Copy link
Owner

@sebastianbergmann sebastianbergmann Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed 3a3285f just now. It should not cause any harm, apart from producing a merge conflict for this PR.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought about this some more: the idea to separate CoveredFileAnalyzer and UncoveredFileAnalyzer from each other was because the former was required after each test and the latter only once at the end. Now we also need the data from UncoveredFileAnalyzer after each test (right?). If that is really the case, then we should merge the two into FileAnalyzer.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reverted 3a3285f (so that this PR no longer has conflicts). I think we should separate caching refactorings/improvements from the topic of this PR.

@Slamdunk Slamdunk marked this pull request as draft February 10, 2022 07:49
@codecov
Copy link

codecov bot commented Feb 10, 2022

Codecov Report

Merging #892 (ebe3dc9) into 9.2 (be9a612) will increase coverage by 0.17%.
The diff coverage is 89.47%.

Impacted file tree graph

@@             Coverage Diff              @@
##                9.2     #892      +/-   ##
============================================
+ Coverage     83.19%   83.37%   +0.17%     
- Complexity     1112     1131      +19     
============================================
  Files            62       62              
  Lines          3648     3555      -93     
============================================
- Hits           3035     2964      -71     
+ Misses          613      591      -22     
Impacted Files Coverage Δ
...rc/StaticAnalysis/CachingUncoveredFileAnalyser.php 0.00% <0.00%> (ø)
src/CodeCoverage.php 50.00% <100.00%> (+5.04%) ⬆️
src/RawCodeCoverageData.php 97.53% <100.00%> (+0.06%) ⬆️
...c/StaticAnalysis/ExecutableLinesFindingVisitor.php 100.00% <100.00%> (ø)
src/Driver/PcovDriver.php 38.88% <0.00%> (-5.56%) ⬇️
src/Report/PHP.php 88.88% <0.00%> (-2.78%) ⬇️
src/Report/Html/Renderer/Dashboard.php 96.61% <0.00%> (-2.63%) ⬇️
src/StaticAnalysis/IgnoredLinesFindingVisitor.php 86.66% <0.00%> (-2.23%) ⬇️
...rc/StaticAnalysis/ParsingUncoveredFileAnalyser.php 90.90% <0.00%> (-0.76%) ⬇️
src/Report/Xml/Coverage.php 94.73% <0.00%> (-0.27%) ⬇️
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update be9a612...ebe3dc9. Read the comment docs.

Copy link
Contributor Author

@Slamdunk Slamdunk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, here's few notes about the current state of the PR. I'll refer to the single commits as they are easier to review one by one:

  1. ea7bbe0 introduces the edits needed to address a variation of the original issue that I found wasn't covered at all due to missing BinaryOp and CallLike in executable statement list
  2. 4520d9d splits line and path coverage into separate handling: my research on xDebug path/branch coverage tells me it's a completely different algorithm that doesn't intersect well with normal line coverage. A bonus of this commit is that it shows that keepFunctionCoverageDataOnlyForLines has always been completely uncovered by tests
  3. 564916a is the tedious work for updating the test suite against the new logic.
    The biggest thing emerged by this commit is that priority line checks on CC merge now never encounters unset nor dead LoC, see https://app.codecov.io/gh/sebastianbergmann/php-code-coverage/compare/892/changes#D6L201
    image
    I consider this a good hint on the behavior of the change. I'll delete those lines if the review of this PR is positive: we don't need that code anymore
    see Exclude non-executable lines #892 (comment)
  4. So I tried the PR against my biggest prod project, and 91be05c emerged: I had to fine tune ExecutableLinesFindingVisitor to match xDebug/PCOV out at our best.
  5. 2f8bbbd is the optimization suggested by @dvdoug
  6. 07df4ac is needed in order to re-cache everything at next release. I made up a simple v2 prefix: maybe we can leverage the Version class, but I don't see the need for such a massive re-cache at every release, @sebastianbergmann WDYT about this?
  7. 4a9b10f to make test pass on PHP < 8.0. $this ?-> state was there to test NullsafePropertyFetch nodes, but I was a bit lazy and don't want, for now, to set up a completely new test just for that ^^

I hope I gave all and only the infos necessary to have you read this new PR state easily

@Slamdunk Slamdunk marked this pull request as ready for review February 10, 2022 15:44
@Slamdunk
Copy link
Contributor Author

db5894a introduces a new LoC on BankAccount which is a real dead LoC, so that ProcessedCodeCoverageData::priorityForLine is properly fully tested again.

But there's an issue: any try I had to adopt tests/_files/Report/HTML/PathCoverageForBankAccount/BankAccount.php_path.html leads to preg_match(): Compilation failed: regular expression is too large at offset errors, even when I just copy-paste the actual output: @sebastianbergmann how should I handle this?

@dvdoug
Copy link
Contributor

dvdoug commented Feb 11, 2022

db5894a introduces a new LoC on BankAccount which is a real dead LoC, so that ProcessedCodeCoverageData::priorityForLine is properly fully tested again.

But there's an issue: any try I had to adopt tests/_files/Report/HTML/PathCoverageForBankAccount/BankAccount.php_path.html leads to preg_match(): Compilation failed: regular expression is too large at offset errors, even when I just copy-paste the actual output: @sebastianbergmann how should I handle this?

Oh I remember those errors from making the original - if you look closely at the existing file, there are a lot of %s placeholders, e.g. the parts reproducing the source code (because those are effectively tested elsewhere), and many of the class names and other HTML attributes. There are some full ones left to in order to test various scenarios, but as you've discovered it's not possible to do full-match testing of this particular file as it's too large

<script src="_js/popper.min.js" type="text/javascript"></script>
<script src="_js/bootstrap.min.js" type="text/javascript"></script>
<script src="_js/file.js" type="text/javascript"></script>
%a
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is needed in order to avoid the preg_match error. I think this footer part doesn't add value to the overall test suite, so it should be safe

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the footer does not add value. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants