roachtest: cdc/sink-chaos/rangefeed=false failed [skipped] #36018

Closed
cockroach-teamcity opened this issue Mar 21, 2019 · 3 comments · Fixed by #36852
Assignees
Labels
C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot.

Comments

@cockroach-teamcity
Member

SHA: https://github.com/cockroachdb/cockroach/commits/5b36cc6276340282cb333ff4a9cb4f1fbd6c3348

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=cdc/sink-chaos/rangefeed=false PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1189990&tab=buildLog

The test failed on release-2.1:
	cluster.go:1267,cdc.go:625,cdc.go:125,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1189990-cdc-sink-chaos-rangefeed-false:4 -- ./workload run tpcc --warehouses=100 --duration=30m --tolerate-errors {pgurl:1-3}  returned:
		stderr:
		
		stdout:
		       0            2.0            2.3     14.7     30.4     30.4     30.4 stockLevel
		   2m35s        0            3.0            2.2     52.4     54.5     54.5     54.5 delivery
		   2m35s        0           23.0           21.5     29.4     39.8     39.8     39.8 newOrder
		   2m35s        0            4.0            2.1      6.0      7.1      7.1      7.1 orderStatus
		   2m35s        0           19.0           22.3     17.8     22.0     23.1     23.1 payment
		   2m35s        0            1.0            2.3     27.3     27.3     27.3     27.3 stockLevel
		   2m36s        0            3.0            2.2     48.2     54.5     54.5     54.5 delivery
		   2m36s        0           27.0           21.5     29.4     39.8     44.0     44.0 newOrder
		   2m36s        0            1.0            2.1      6.3      6.3      6.3      6.3 orderStatus
		   2m36s        0           19.0           22.2     15.2     17.8     18.9     18.9 payment
		   2m36s        0            1.0            2.2     10.0     10.0     10.0     10.0 stockLevel
		: signal: killed
	cluster.go:1626,cdc.go:213,cdc.go:417,test.go:1214: unexpected status: failed

@cockroach-teamcity cockroach-teamcity added C-test-failure Broken test (automatically or manually discovered). O-roachtest O-robot Originated from a bot. labels Mar 21, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/c11656058e4a36c0c62275d7c188ef8921e02928

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=cdc/sink-chaos/rangefeed=false PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1191975&tab=buildLog

The test failed on release-2.1:
	cluster.go:1267,cdc.go:633,cdc.go:133,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1191975-cdc-sink-chaos-rangefeed-false:4 -- ./workload run tpcc --warehouses=100 --duration=30m --tolerate-errors {pgurl:1-3}  returned:
		stderr:
		
		stdout:
		       0            0.0            2.4      0.0      0.0      0.0      0.0 stockLevel
		   2m19s        0            2.0            2.3     35.7     44.0     44.0     44.0 delivery
		   2m19s        0           35.0           21.1     32.5     39.8     44.0     44.0 newOrder
		   2m19s        0            0.0            2.3      0.0      0.0      0.0      0.0 orderStatus
		   2m19s        0           23.0           23.2     15.7     18.9     24.1     24.1 payment
		   2m19s        0            2.0            2.4     12.1     16.3     16.3     16.3 stockLevel
		   2m20s        0            1.0            2.3     50.3     50.3     50.3     50.3 delivery
		   2m20s        0           27.0           21.2     31.5     35.7     37.7     37.7 newOrder
		   2m20s        0            3.0            2.3      6.6      8.1      8.1      8.1 orderStatus
		   2m20s        0           19.0           23.1     15.7     19.9     19.9     19.9 payment
		   2m20s        0            2.0            2.4     12.1     14.7     14.7     14.7 stockLevel
		: signal: killed
	cluster.go:1626,cdc.go:221,cdc.go:425,test.go:1214: unexpected status: failed

@danhhz danhhz self-assigned this Mar 22, 2019
@cockroach-teamcity
Member Author

SHA: https://github.com/cockroachdb/cockroach/commits/4f23ef547ad7af684f7b8cc349be8c1dc4d30aa3

Parameters:

To repro, try:

# Don't forget to check out a clean suitable branch and experiment with the
# stress invocation until the desired results present themselves. For example,
# using stress instead of stressrace and passing the '-p' stressflag which
# controls concurrency.
./scripts/gceworker.sh start && ./scripts/gceworker.sh mosh
cd ~/go/src/github.com/cockroachdb/cockroach && \
stdbuf -oL -eL \
make stressrace TESTS=cdc/sink-chaos/rangefeed=false PKG=roachtest TESTTIMEOUT=5m STRESSFLAGS='-maxtime 20m -timeout 10m' 2>&1 | tee /tmp/stress.log

Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1204603&tab=buildLog

The test failed on release-2.1:
	cluster.go:1267,cdc.go:633,cdc.go:133,cluster.go:1605,errgroup.go:57: /home/agent/work/.go/src/github.com/cockroachdb/cockroach/bin/roachprod run teamcity-1204603-cdc-sink-chaos-rangefeed-false:4 -- ./workload run tpcc --warehouses=100 --duration=30m --tolerate-errors {pgurl:1-3}  returned:
		stderr:
		
		stdout:
		       0            3.0            2.3     21.0     26.2     26.2     26.2 stockLevel
		   2m15s        0            2.0            2.2     35.7     56.6     56.6     56.6 delivery
		   2m15s        0           27.0           21.1     32.5     44.0     50.3     50.3 newOrder
		   2m15s        0            5.0            2.4      6.3      7.3      7.3      7.3 orderStatus
		   2m15s        0           29.0           23.1     15.2     17.8     23.1     23.1 payment
		   2m15s        0            2.0            2.3     22.0     24.1     24.1     24.1 stockLevel
		   2m16s        0            3.0            2.2     41.9     52.4     52.4     52.4 delivery
		   2m16s        0           24.0           21.1     29.4     37.7     37.7     37.7 newOrder
		   2m16s        0            1.0            2.4      7.1      7.1      7.1      7.1 orderStatus
		   2m16s        0           20.0           23.1     16.3     22.0     25.2     25.2 payment
		   2m16s        0            0.0            2.3      0.0      0.0      0.0      0.0 stockLevel
		: signal: killed
	cluster.go:1626,cdc.go:221,cdc.go:425,test.go:1216: unexpected status: failed

@danhhz
Contributor

danhhz commented Mar 28, 2019

Didn't open up the logs, but I assume this latest failure is the same issue as #36019 (comment)

@danhhz danhhz changed the title from "roachtest: cdc/sink-chaos/rangefeed=false failed" to "roachtest: cdc/sink-chaos/rangefeed=false failed [skipped]" Apr 3, 2019
danhhz added a commit to danhhz/cockroach that referenced this issue Apr 15, 2019
For a while, the cdc/crdb-chaos and cdc/sink-chaos roachtests have been
failing because an error that should be marked as retryable wasn't. As a
result of the discussion in cockroachdb#35974, I tried switching from a whitelist
(retryable error) to a blacklist (terminal error) in cockroachdb#36132, but on
reflection this doesn't seem like a great idea. We added a safety net to
prevent false negatives from retrying indefinitely but it was
immediately apparent that this meant we needed to tune the retry loop
parameters. Better is to just do the due diligence of investigating the
errors that should be retried and retrying them.

The commit is intended for backport into 19.1 once it's baked for a bit.

Closes cockroachdb#35974
Closes cockroachdb#36018
Closes cockroachdb#36019
Closes cockroachdb#36432

Release note (bug fix): `CHANGEFEED`s now retry instead of erroring in
more situations
craig bot pushed a commit that referenced this issue Apr 16, 2019
36804: sql/sem/pretty: use left alignment for column names in CREATE r=knz a=knz

Before:

```
CREATE TABLE t (
    name STRING,
    id INT8
       NOT NULL
       PRIMARY KEY
)
```

After:

```
CREATE TABLE t (
    name STRING,
    id   INT8
         NOT NULL
         PRIMARY KEY
)
```


36852: changefeedccl: switch retryable errors back to a whitelist r=nvanbenschoten a=danhhz

For a while, the cdc/crdb-chaos and cdc/sink-chaos roachtests have been
failing because an error that should be marked as retryable wasn't. As a
result of the discussion in #35974, I tried switching from a whitelist
(retryable error) to a blacklist (terminal error) in #36132, but on
reflection this doesn't seem like a great idea. We added a safety net to
prevent false negatives from retrying indefinitely but it was
immediately apparent that this meant we needed to tune the retry loop
parameters. Better is to just do the due diligence of investigating the
errors that should be retried and retrying them.

The commit is intended for backport into 19.1 once it's baked for a bit.

Closes #35974
Closes #36018
Closes #36019
Closes #36432

Release note (bug fix): `CHANGEFEED`s now retry instead of erroring in
more situations

36872: coldata: fix Slice when slicing up to batch.Length() r=yuzefovich a=asubiotto

A panic occurred because we weren't treating the end slice index as
exclusive, resulting in an out of bounds panic when attempting to slice
the nulls slice.

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
Co-authored-by: Daniel Harrison <daniel.harrison@gmail.com>
Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>
@craig craig bot closed this as completed in #36852 Apr 16, 2019
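
The exclusive end index mentioned in the 36872 description above can be illustrated with a small sketch. This is not the coldata implementation: `column` and its `Slice` method are hypothetical, and the nulls are modeled as a plain `[]bool` of the same length as the values, which is a simplifying assumption.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a column vector with a parallel
// nulls slice. Both slices are assumed to have the same length.
type column struct {
	values []int64
	nulls  []bool
}

// Slice returns a view of rows [start, end). Because end is exclusive,
// end == len(c.values) is valid and selects through the last row.
func (c column) Slice(start, end int) (column, error) {
	if start < 0 || end < start || end > len(c.values) {
		return column{}, fmt.Errorf(
			"slice bounds [%d:%d) out of range for length %d",
			start, end, len(c.values))
	}
	return column{
		values: c.values[start:end],
		nulls:  c.nulls[start:end], // same exclusive bound as the values
	}, nil
}

func main() {
	col := column{
		values: []int64{10, 20, 30},
		nulls:  []bool{false, true, false},
	}
	// Slicing up to the full length must not go out of bounds.
	view, err := col.Slice(1, len(col.values))
	fmt.Println(view.values, view.nulls, err)
}
```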
danhhz added a commit to danhhz/cockroach that referenced this issue Apr 24, 2019

2 participants