bazel/ci: No pass/skip/failed event for test flakes #88048

Closed
msbutler opened this issue Sep 16, 2022 · 5 comments · Fixed by #88499
Labels: C-test-failure (Broken test, automatically or manually discovered), sync-me

@msbutler
Collaborator

msbutler commented Sep 16, 2022

See CI info here.

Jira issue: CRDB-19661

msbutler added the C-test-failure (Broken test, automatically or manually discovered) and T-sql-queries (SQL Queries Team) labels Sep 16, 2022
msbutler changed the title from "TestWindowFramer flaked on bazel essential ci in release-22.2" to "queries: TestWindowFramer flaked on bazel essential ci in release-22.2" Sep 16, 2022
@DrewKimball
Collaborator

Log failure is this:

[15:49:58]F:			 [TestWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17] No pass/skip/fail event found for test
[15:49:58]F:			 [TestWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17] === RUN   TestWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17

I don't see anything else in the logs that looks like it could be causing this, and running that test 15,000 times under stress didn't turn anything up.

@yuzefovich
Member

Saw it on master too (on #88195).

@DrewKimball
Collaborator

Maybe it has something to do with the fact that the random testing can generate duplicate test names: both of these cases have end=UnboundedFollowing#01. I'm guessing the #01 is appended to de-duplicate the names, so maybe something is going wrong with that? An easy way to prevent duplicates would be to increment and print a counter for each test. @yuzefovich does this seem reasonable to you?

@yuzefovich
Member

Hm, it seems unlikely.

I do see this in the log:

            --- PASS: TeI220916 15:19:35.406930 702 3@pebble/event.go:649  [pebble,s?] 2  [JOB 1] MANIFEST created 000001
I220916 15:19:35.407183 702 3@pebble/event.go:677  [pebble,s?] 3  [JOB 1] WAL created 000002
I220916 15:19:35.407368 702 3@pebble/event.go:645  [pebble,s?] 4  upgraded to format version: 002
I220916 15:19:35.407412 702 3@pebble/event.go:645  [pebble,s?] 5  upgraded to format version: 003
I220916 15:19:35.407436 702 3@pebble/event.go:645  [pebble,s?] 6  upgraded to format version: 004
I220916 15:19:35.407456 702 3@pebble/event.go:645  [pebble,s?] 7  upgraded to format version: 005
I220916 15:19:35.407478 702 3@pebble/event.go:645  [pebble,s?] 8  upgraded to format version: 006
I220916 15:19:35.407500 702 3@pebble/event.go:645  [pebble,s?] 9  upgraded to format version: 007
I220916 15:19:35.407524 702 3@pebble/event.go:645  [pebble,s?] 10  upgraded to format version: 008
I220916 15:19:35.407547 702 3@pebble/event.go:645  [pebble,s?] 11  upgraded to format version: 009
I220916 15:19:35.407584 702 3@pebble/event.go:645  [pebble,s?] 12  upgraded to format version: 010
I220916 15:19:35.407968 702 sql/colexec/colexecwindow/window_functions_test.go:1035  [-] 13  spillForced=true/windowFunc:ROW_NUMBER
I220916 15:19:35.408041 702 sql/colexec/colexectestutils/utils.go:691  [-] 14  batchSize=1/sel=false
I220916 15:19:35.409582 702 sql/colexec/colexectestutils/utils.go:691  [-] 15  batchSize=1/sel=true
I220916 15:19:35.410661 702 sql/colexec/colexectestutils/utils.go:691  [-] 16  batchSize=3/sel=false
I220916 15:19:35.411499 702 sql/colexec/colexectestutils/utils.go:691  [-] 17  batchSize=3/sel=true
I220916 15:19:35.412282 702 sql/colexec/colexectestutils/utils.go:630  [-] 18  randomNullsInjection
I220916 15:19:35.412862 744 3@pebble/event.go:669  [pebble,s?] 19  [JOB 4] all initial table stats loaded
I220916 15:19:35.413143 702 sql/colexec/colexectestutils/utils.go:364  [-] 20  allNullsInjection
I220916 15:19:35.414728 702 sql/colexec/colexecwindow/window_functions_test.go:1035  [-] 21  spillForced=true/windowFunc:RANK
I220916 15:19:35.414755 702 sql/colexec/colexectestutils/utils.go:691  [-] 22  batchSize=1/sel=false
stWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17 (0.00s)
                --- PASS: TestWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17/ordered/asc=true/typ=timestamptz/exclusion=NoExclusion (0.00s)
                --- PASS: TestWindowFramer/mode=Groups/start=OffsetFollowing/end=UnboundedFollowing#01/count=17/ordered/asc=false/typ=timestamptz/exclusion=NoExclusion (0.00s)

So it seems the problem is that the test output is interleaved with other log output: the "--- PASS" line for the affected test is split in two by the log lines above. I don't see anything wrong in the test, so maybe it's a quirk of Bazel CI? cc @rickystewart
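
To illustrate why interleaving could produce the "No pass/skip/fail event found" error, here is a minimal sketch. It assumes a hypothetical, simplified line-based matcher (`RESULT_RE` and `find_results` are made up for this example and are not the actual CI tooling). The input lines are taken from the logs quoted in this thread.

```python
import re

# Hypothetical, simplified matcher for Go-style result lines.
# The real CI tooling may parse test output differently.
RESULT_RE = re.compile(r"^\s*--- (PASS|FAIL|SKIP): (\S+) \(")


def find_results(lines):
    """Return {test_name: status} for every line that matches RESULT_RE."""
    results = {}
    for line in lines:
        m = RESULT_RE.match(line)
        if m:
            results[m.group(2)] = m.group(1)
    return results


NAME = (
    "TestWindowFramer/mode=Groups/start=OffsetFollowing/"
    "end=UnboundedFollowing#01/count=17"
)

# The result line as it would look if nothing else wrote to the stream.
clean = [f"            --- PASS: {NAME} (0.00s)"]

# The same result line with a log entry written in the middle of it,
# as seen in the log excerpt above (intermediate log lines omitted).
interleaved = [
    "            --- PASS: TeI220916 15:19:35.406930 702 3@pebble/event.go:649"
    "  [pebble,s?] 2  [JOB 1] MANIFEST created 000001",
    "stWindowFramer/mode=Groups/start=OffsetFollowing/"
    "end=UnboundedFollowing#01/count=17 (0.00s)",
]

print(find_results(clean))
print(find_results(interleaved))
```

With the clean input the matcher records a PASS for the subtest. With the interleaved input neither fragment matches, so a consumer working this way would see the `=== RUN` line for the test but never a result for it.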

@rickystewart
Collaborator

cockroachdb/rules_go#8 should fix this I think.

rickystewart self-assigned this Sep 21, 2022
rickystewart changed the title from "queries: TestWindowFramer flaked on bazel essential ci in release-22.2" to "bazel/ci: No pass/skip/failed event for test flakes" Sep 21, 2022
yuzefovich removed the T-sql-queries (SQL Queries Team) label Sep 22, 2022
craig bot pushed a commit that referenced this issue Sep 22, 2022
88454: ui: insights transaction details support multiple blocking transactions r=j82w a=j82w

This adds support for multiple blocking transactions for a single waiting transaction. The cards were merged into the table, and the data was piped through to show multiple rows. The total contention time was also fixed to aggregate the contention time instead of just picking the latest.

before:
https://loom.com/share/0384ed937a344e2fb0105fefbc313acb

after:
https://www.loom.com/share/78e906f50a694cd59ac893ddb9c2239a

closes #88264

Release justification: Category 2: Bug fixes and
low-risk updates to new functionality

Release note: (ui change): Add support for multiple
blocking transactions on the insights transaction
details page. Merged the cards into the table
and fixed the total contention time.

88470: *: upgrade grpc to v1.47.0 r=erikgrinaker a=pavelkalinnikov

Fixes #81227

Release note: upgrade grpc from v1.46.0 to v1.47.0, which fixes a subtle bug
causing a panic on a nil pointer.

88477: keys: mark 49 as reserved r=ajwerner a=ajwerner

Release note: None

88496: persistedsqlstats: speed up a test r=yuzefovich a=yuzefovich

Previously, a single unit test could take on the order of 4 minutes (or, rarely, even exceed the 5-minute timeout) because the job monitor checks whether a cluster setting has been updated only once a minute, and we update the cluster setting twice in the unit test. This commit makes the check happen every second in a testing setup.

Release note: None

88499: bazel: upgrade `rules_go` r=rail a=healthy-pod

Pull in cockroachdb/rules_go#8.

Closes #88048

Release note: None

Co-authored-by: j82w <jwilley@cockroachlabs.com>
Co-authored-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
Co-authored-by: healthy-pod <ahmad@cockroachlabs.com>
craig bot closed this as completed in 8cd7f30 Sep 22, 2022
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this issue Nov 9, 2022
healthy-pod pushed a commit to healthy-pod/cockroach that referenced this issue Nov 14, 2022
rytaft added the sync-me label Jul 11, 2023