Fix codecov metrics #3039
I was wondering if the release optimisations were impacting the code coverage reports, so I ran a quick test of the CI over on my fork with a commit reverting the coverage build back to a debug build. main...ruffsl:2a946f1a349083c1285e29fbf7d5d72338ddadb9 https://app.codecov.io/gh/ruffsl/navigation2/commit/2a946f1a349083c1285e29fbf7d5d72338ddadb9 It didn't seem to improve the overall percentage, and the files that did change appear to be nondeterministic, given that no code was changed between before and after the revert to debug. Do we have any other hypotheses?
I think there's usually a little noise between runs due to the nature of system-level smoke tests, but it shouldn't be massively different in the final number. That's odd; maybe the default compile flags have changed. I was also wondering whether, if a system test fails, there could have been a change that results in none of the coverage metrics for that package being reported - though that may be an ill-informed thought; I haven't looked in detail at how exactly the code coverage metrics are captured.
Tested: removing it. Edit: did not help. Nor did using the alternative.
@tylerjw, would you have any ideas here? I'm thinking that if it isn't lcov, then perhaps the bump in gcc version has changed the effectiveness of the default gcc args specified via the coverage-gcc colcon mixin. Was moveit's code coverage affected by the migration to 22.04? Edit: I suppose the moveit project relies more on clang than gcc, though?
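For reference, a minimal sketch of why both hypotheses above are plausible (a standalone file with made-up names, not nav2 code): GCC's `--coverage` flag is a synonym for `-fprofile-arcs -ftest-coverage` at compile time plus `-lgcov` at link time, and if the default optimization level shifted with the newer toolchain, inlining and branch folding under `-O2` can change which source lines gcov attributes hits to, whereas a debug `-O0 -g` build keeps the counts aligned with the source.

```cpp
// coverage_example.cpp - hypothetical standalone example, not from nav2.
//
//   g++ --coverage -O0 -g coverage_example.cpp -o coverage_example
//   ./coverage_example 5        # writes coverage_example.gcda on clean exit
//   gcov coverage_example.cpp   # line counts map directly onto the source
//
// Rebuilding the same file with -O2 lets GCC inline is_large() and fold the
// branch, so the per-line counts gcov reports can differ even though the
// program behaves identically.
#include <cstdlib>
#include <iostream>

bool is_large(int value) {
  // Under -O2 this function is typically inlined into main(), which changes
  // how the instrumented arcs map back onto these source lines.
  if (value > 100) {
    return true;
  }
  return false;
}

int main(int argc, char ** argv) {
  const int value = (argc > 1) ? std::atoi(argv[1]) : 0;
  std::cout << std::boolalpha << is_large(value) << std::endl;
  return 0;  // libgcov flushes the counters from its exit hook here
}
```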
Taking a look at the CircleCI insights for nightly jobs from the main branch, it looks like a spike of 1,695 skipped tests started around pipeline number 7325, whereas before then the number of skipped tests was flatly zero. So from the logs, we can pinpoint the spike in skipped tests as occurring between these two consecutive nightly jobs: https://app.circleci.com/pipelines/github/ros-planning/navigation2/7319
https://app.circleci.com/pipelines/github/ros-planning/navigation2/7325
Edit: the diff for this corresponds to... my PR switching to Ubuntu Jammy. Of course it would... Still not sure how that resulted in the spike of skipped tests. A cmake or colcon thing?
@nuclearsandwich, do you have any experience with this? I suspect that whatever changed may have impacted you as well, so you might have looked into it previously. This has been an issue since we moved to Rolling on 22.04.
No, it's not something that has crossed my feeds as far as I recall. The major jammy-related fallout that I've encountered has come down to either (1) needing to update the Docker seccomp profile so that new system calls are not masked inside jammy containers running on older hosts (this primarily affects you if you're running Docker from the Debian or Ubuntu repositories rather than from the official Docker repositories), or (2) the phased rollout of stable release updates, which as of 22.04 affects apt-get / the apt CLI rather than just the GUI tools. But both failures are usually more catastrophic for builds than what you're seeing. @ruffsl, nav2 CI currently uses the nightly packaging job archives, right?
No, we have switched to rolling for the time being, as all of our dependencies have been upstreamed, so it makes development a bit more stable to work with.
Not too hard; I pushed some commits to my fork that revert the Docker image to build from nightly. You could modify the FROM base image and URL or COPY the tarball in the nightly dockerfile, build and tag it, then change the FROM image in the Dockerfile on my fork to try it out. If you give me a URL for focal, I could also try it out myself. One can also run all the tests by changing the default build args.
Doing this using the current
If https://ci.ros2.org/job/ci_packaging_linux/514 completes successfully, it should give you a devel archive built on focal.
@nuclearsandwich, I modified the ros2 nightly dockerfile to build from focal with your provided tarball like so: osrf/docker_images@master...ruffsl:docker_images:ros2-nightly-focal But it looks like there is an issue with the package
Just breathing some life back into this in case there's anything new we can try. It would be great to have accurate metrics here after we spent so much time creating these system-level tests!
I did an audit; we're missing coverage in, most notably,

All of which have system tests associated with them. Below is the attached file from the test run I analyzed. I'm focusing on the WP Follower test, since that involves some of these servers and has a dedicated test (22 in this file, starting at line 5599). I see the following error:

But otherwise it seems to work the way it should and ends as expected, though I do see it has to be escalated to exit the process.

Could that be related?
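For context, a hedged sketch of how a forced kill can interact with coverage collection (assuming the binary is built with GCC's `--coverage`; this is illustrative code, not anything from nav2): libgcov normally writes the `.gcda` counter files from an exit hook, so a process that has to be escalated to SIGKILL never runs that hook and contributes no coverage data at all. Dumping the counters from a termination-signal handler with `__gcov_dump()` is one way to persist them before a hard kill can happen.

```cpp
// gcov_dump_on_signal.cpp - hypothetical example, not part of nav2.
//
//   g++ --coverage -O0 -g gcov_dump_on_signal.cpp -o gcov_dump_on_signal
//
// Without the handler below, `kill -KILL <pid>` leaves no .gcda file behind,
// because libgcov only writes the counters when the process exits normally.
#include <csignal>
#include <unistd.h>

// Provided by libgcov when building with --coverage (newer GCC also declares
// it in <gcov.h>); it writes the accumulated counters out to the .gcda files.
extern "C" void __gcov_dump(void);

extern "C" void dump_coverage_and_exit(int signum) {
  __gcov_dump();        // persist the coverage counters right now
  _exit(128 + signum);  // exit immediately; the counters are already on disk
}

int main() {
  std::signal(SIGINT, dump_coverage_and_exit);
  std::signal(SIGTERM, dump_coverage_and_exit);

  // Stand-in for the real work (spinning nodes, running a test, etc.).
  pause();  // block until a signal arrives
  return 0;
}
```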
Well, that's neat-o @ruffsl. I found that if I remove composition, the coverage comes back: https://app.codecov.io/gh/ros-planning/navigation2/pull/3266 shows 85.94% test coverage from just swapping the WPF test over to that. I just updated #3266 to migrate all the tests over; let's see how it goes. @nuclearsandwich, I don't expect you to do anything about that, but have you heard about coverage metrics being lost with composition before, perhaps? Or, if SIGKILL has to be invoked, does that mess with it? (That might not be a question for you, though.)
Whoa, that is trippy. Changes to the inner workings of node composition during the rolling migration could explain the coverage regression. I just wonder what exactly the change impacted. Could the way that the processes are exited be interfering with writing coverage files to disk, say for some nodes in the composed container but not for others, when coverage is enabled using the GCC coverage flags?
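As a rough illustration of why composition would amplify whatever is going wrong (a manual-composition sketch with placeholder node names, not the actual nav2 bringup, which loads components into a container dynamically): with composition all of the servers run inside one process, so their coverage counters live in that single process's memory and are only flushed to disk when it exits cleanly; kill that one process and every composed node's coverage is lost at once, whereas with separate processes a bad exit only loses the counters for that one node.

```cpp
// manual_composition_sketch.cpp - hypothetical, loosely modeled on manual
// composition in ROS 2; the node names are placeholders, not nav2's servers.
#include <memory>
#include <rclcpp/rclcpp.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);

  // All of these nodes share this one process. Built with --coverage, their
  // counters accumulate in this process's memory and are only written out
  // to the .gcda files when this process shuts down cleanly.
  auto planner = std::make_shared<rclcpp::Node>("planner_server");
  auto controller = std::make_shared<rclcpp::Node>("controller_server");
  auto behaviors = std::make_shared<rclcpp::Node>("behavior_server");

  rclcpp::executors::SingleThreadedExecutor executor;
  executor.add_node(planner);
  executor.add_node(controller);
  executor.add_node(behaviors);

  executor.spin();  // a SIGKILL while spinning loses all of the counters

  rclcpp::shutdown();
  return 0;
}
```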
I honestly don't know; I just finished confirming this is true by migrating all the tests over to it from my single-test trial. I'm not sure which it is (or whether it's some mixture of both). https://app.codecov.io/gh/ros-planning/navigation2/pull/3266 is back up to the old regime at 88.6%. I think we should technically be higher than this, but that's still a major improvement.

For instance, the assisted teleop / drive on heading plugins for the behavior server are still missing large chunks of coverage even though their system tests should be invoking those functions. Otherwise though, this appears to be what I would expect in broad strokes.
I'm not sure what I should do about this ticket now. Broadly I'd say it's complete, but I feel a little uneasy about not investigating the root causes, though I have to move on with my week. I'll leave it for now, but may close it at a later date if either of us gets a chance to look at it more. I will at some point need to look at why 2 of the behavior server plugins have low coverage, but those are also both plugins added since we started having these issues, so it wouldn't surprise me if there were actually just some issue in those tests that needs to be changed, rather than this being representative of the same issues.
Nope, I haven't heard, but I'm going to pass that along in case anyone else sees some dots to connect.
I'm going to open another ticket regarding the Assisted Teleop / Drive on Heading lack of testing. I think this ticket is otherwise done (for now), though it would be nice to be able to get coverage metrics with composition - it's unclear if it's us (not exiting cleanly) or composition in general causing it.
Coverage metrics for system tests are not being counted, potentially among other things
CC @ruffsl