-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: "publish bleeding edge" failures often don't have issues opened for them #40414
Comments
https://teamcity.cockroachdb.com/admin/editBuildRunners.html?id=buildType:Cockroach_MergeToMaster Here's the build step that should do this. Here's a build where that claims to have worked but didn't: https://teamcity.cockroachdb.com/viewLog.html?buildId=1471297&buildTypeId=Cockroach_MergeToMaster&tab=buildLog&_focus=167#_state=167 I would start by running that script locally against that build to see if the problem is in posting the issue or in detecting the failed tests. |
I've been chasing the TeamCity API calls made by the |
I changed the nightlies to not depend on publish bleeding edge. They just pull the most recently generated binaries from the same branch. So at least that problem is solved. |
FWIW, this issue is still going wild: https://teamcity.cockroachdb.com/viewLog.html?buildId=1511685&buildTypeId=Cockroach_UnitTests None of these failures had issues opened for them. They're new issues, so it's especially unfortunate that we don't get to see them. I'm not aware that anyone else is monitoring teamcity the way I feel forced I have to. |
I'll take a look at this very soon. Maybe the script has rotten. |
@jordanlewis did you get a chance to look into this? |
42565: build: use json output and avoid go-test-teamcity r=jordanlewis a=tbg Previously, our tests would produce text output and go-test-teamcity would be charged with translating from that to TeamCity output. This was buggy, error-prone and undertested. Parsing messy test output is hard and there are all kinds of corner cases. Instead of trying to solve this ourselves, rely on the built-in json output that the go toolchain provides. This output is far from perfect - I'm pretty sure it has some bugs as well - but it will become better over time without any involvement on our end. Additionally, TeamCity knows how to parse this format already, so go-test-teamcity goes away, and - hopefully - with each TeamCity upgrade the way the tests are handled is going to improve. There is one thing that go-test-teamcity did that we would like to keep, namely filtering the test output to omit the output for tests that passed. Our tests are really chatty, and we don't want to pollute our TeamCity machines with 200mb log files. So I wrote a little tool that does *just* this filtering (called testfilter) which we'll invoke instead of go-test-teamcity. The pipeline looks something like this: ``` make test JSON=true ... \ testfilter -mode=strip \ | tee tmp.json testfilter -mode=omit < tmp.json | testfilter -convert > failures.txt rm tmp.json ``` where - `mode=strip` removes the output for non-failing tests (but not the start-stop events). TeamCity will thus learn about all tests, but won't be burdened with retaining (or even parsing) test output that is irrelevant. - `mode=omit` only keeps failing tests (i.e. will never even mention any others). We pipe this to: - `mode=convert` to get back to human-readable text, in case someone would like to download an artifact with only relevant lines in it. I don't expect anyone to really need the artifact, because TeamCity lets you copy the "stack trace" (=test output) to clipboard from the UI, but you never know. The very nature of how robust our CI pipeline is dictates that I will have broken something, somewhere. I'll be watching out for fallout when this merges. However there are also at least two improvements: - the "stress pull request" failures are now properly recognized by TeamCity; previously one had to sift through the build log. - teamcity-post-failures.py is no more (actually it still is, but it's a noop, just to make sure we don't have to maintain two separate TeamCity configs for before/after this change). Issue posting is now done via github-post. Fixes #15034. Informs #40414. Release note: None 42921: Added myself to AUTHORS file. r=aaron-crl a=aaron-crl Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com> Co-authored-by: Aaron Blum <aaron@cockroachlabs.com>
teamcity-post-failures.py is now a noop on master; we use the same tool that nightly stress uses to post issues from release branches now. Will close this once I've confirmed it working. |
Re-ping on this. I haven't done anything on it, but it would be great if this script could do something on failure - I've gotten too used to polling #teamcity on Slack to find failures. For example, check out this recent failure: https://teamcity.cockroachdb.com/viewLog.html?buildId=1995033&tab=buildResultsDiv&buildTypeId=Cockroach_MergeToMaster I think this is happening during |
I don't think we ever opened issues on master when the build failed (or did teamcity-post-failures.py do that as some sort of accidental side effect?), it was always for posting issues based on Go tests failing. |
This build is pretty red. Not having it green means that we don't have these binaries stageable, and I think it also impacts the nightly roachtest builds which has the bleeding edge build as a prereq. We should not lose this kind of test coverage at this point.
https://teamcity.cockroachdb.com/buildConfiguration/Cockroach_MergeToMaster?branch=%3Cdefault%3E&buildTypeTab=overview
Nobody really checks this page, so we need to fix the issue posting ASAP.
Examples here: https://cockroachlabs.slack.com/archives/CJ0H8Q97C/p1567069131004000
Epic CRDB-10439
Jira issue: CRDB-5537
The text was updated successfully, but these errors were encountered: